Both government regulators and operations personnel have made careers out of trusting black boxes that no one fully understands anymore.
Regulatory specifications are ambiguous and disorganized. Imagine working with someone who knows that you are trying to shed light on the technology that they were responsible for, yet do not fully understand. How cooperative do you think these people will be with your efforts?

How do you think the managers responsible for this work survive the politics of the project's inevitable failure? Blame rolls downhill. Modernization projects require strong leadership and collaboration across many teams. That's a huge problem, because such an environment very likely does not exist in the world today.

No one understands how the algorithms work or even how they should work! Forget how challenging it is to work with COBOL, which someone could eventually figure out. Whatever you've written has to be tested, and you don't have a trusted source of business logic from which to verify how it ought to work. You only have the black box.
If you've been given this task, start interviewing at other companies. You were given a suicide mission by people who are well aware of that.
My experience with this was with regulatory margin trading systems supporting a multi-billion dollar market.
I have been on both sides of this and it sucks. I came up with Rug Driven Development as a title. All the risks and future problems are swept under the rug quickly while maintaining a happy smile and pretending everything is just fine and dandy.
I so need to steal this term. Not exactly for what you are describing, even though I come across this on a nearly weekly basis in so many different industries. Be it automotive, be it finance, be it whatever.
But - and this is where I will probably use it in the future - I come across it in agency work. You win a shiny new project proposal and need to quickly show something for your corporate "partners" to shine before their management. To make them look good for their yearly appraisals or whatnot.
You already know that you will never fix the underlying shit you built, because your contract will be a new pitch three years down the line and it isn't clear if you will win it again. So let the next people deal with the tech debt - but guess what: They will just do the same. So debt will pile on and on.
So "Rug Driven Development" will be my goto term going forward.
I call such things "Ancient Wonders". As in artifacts from long ago that the company owns, but nobody knows how they work or how they were built, and nobody can build one today. There may, in fact, be only one left. Whoever did build them was privy to some Tribal Knowledge that has since been lost.
I worked at a shop whose Ancient Wonder was their custom templating engine with an embedded Python interpreter. The only guy who knew how to compile it had left years before I joined, and as a result everybody just copied the same Solaris .so to new development/test/prod servers as they came in.
Read this story, "Institutional Memory and Reverse Smuggling" - once upon a time, a petrochemical factory was built. Decades later, the plant is still operating and being maintained. But at this point, nobody knows how the whole factory worked, why it was built that way, which processes it ran or how it was constructed...
They let this go on for about six years before I arrived and saw it. It took me pointing out that there was a decent chance the process was relying on something in the OS kernel that isn't quite aligned with the kernel documentation but is good enough, and that the immediate respawn behavior might rely on that in turn. If the kernel is ever "fixed", this audit-compliance-related process suddenly stops and doesn't respawn, or worse, won't start at all, becoming top of mind for all the senior management. It is always cheaper to fix problems in the small, before they become problems no one can ignore.
The urge to sweep problems under the rug and move on is very powerful within our industry. Once you've done the same yourself and been bitten enough times, your scar tissue twitches every time you see the same pattern again. These days, I treat code problems like I treat messes while cooking: I clean up as I go along. My scar tissue thanks me now. There is a delicate balance between addressing these problems in the small and bikeshedding, though.
BTW, I performed an strace, which revealed a SIGKILL that just pops in without anyone or any known process issuing it. The application developers suspect something in the OS, so we're now engaged with the application support team, the OS support team, and our own internal OS support team to track it down and beat it into submission.
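(For anyone wondering why the strace can only show the SIGKILL arriving: the kernel refuses to let a process handle that signal at all, so the victim can never log who sent it. A quick Python demonstration of that rule, behavior as on Linux:)

    import signal

    # SIGKILL is special: the kernel won't let a process catch, block,
    # or ignore it, so the victim's strace can only show it arriving.
    try:
        signal.signal(signal.SIGKILL, signal.SIG_IGN)
    except OSError as e:
        print(e)  # [Errno 22] Invalid argument

To find the sender you have to trace from the kernel side instead, e.g. via the kernel's signal tracepoints.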
I'll steal this :) Encountered a few of those during my career as well; if you can't see a problem, it can't hurt you...
David Edgerton, The Shock of the Old: Technology and Global History Since 1900 (2006).
Andrew L. Russell and Lee Vinsel, The Innovation Delusion: How Our Obsession with the New Has Disrupted the Work That Matters Most (2020).
You won’t agree with everything in those books, but they will make you think deeply about old technologies and the stories we tell ourselves about innovation.
Old code still in production just sounds antifragile to me, and that's the sort of code I'd like to write.
I think sadly you won’t be able to read the preeminent examples of really old antifragile code that’s still powering the world. That stuff is mostly proprietary - banks and governments and so on. Hence this article.
BUT - there are still plenty of mature open source codebases to read, especially operating systems, databases, and Unix utilities. Sure, they’re living projects that now look quite different from the original, but they’re descended from code a few decades old and written in the same language — C. There’s Emacs (35 yrs old), GCC (33 yrs), the Linux kernel (29 yrs), MySQL (25 yrs), and Postgres (a youngster, only 24).
And then there’s the programming languages. C itself is now 48 years old! Since it’s the language of the Linux Kernel and Postgres and so much else, I’d be willing to bet it’ll be around for another 48 years. (Yes, I love Rust, and I’d use it over C for any new project, but this is the Linux Kernel folks!)
But - since this is Hacker News - I should note that C is beaten by a long long way by Lisp, which is 62 years old. That’s older than COBOL - and people still love it. John McCarthy really had programming figured out.
For those who want to learn it - check out this short guide. Highly recommended language!:
But in terms of real-world production code it is barely a footnote, whereas COBOL fills entire libraries.
So while it might be great, there just aren't a lot of examples to look at.
It's hard to overstate how seminal Lisp was.
I don’t disagree but the OP was asking for examples of production code to look at
Maybe you aren't seeing Lisp itself in production code (although someone suggested Emacs, which is Lisp), but you see its ideas and influence in almost every language and program you'd find today. There are little bits of Lisp everywhere you look, even if you don't recognise them as Lisp.
The code base is underdocumented, has no automated tests, and is full of security issues waiting to be found. The moment you're off the beaten path, you end up debugging random bugs and regressions. Code quality is generally very high, as one would expect, but the code is not necessarily resilient or well-designed.
SQLite is a much better example of truly antifragile code.
Every 5 years or so, linux distributions will pick up the latest kernel version and ship it. To upgrade you have to reformat and reinstall your computer, there's no continuity of service, there's no upgrade path.
Minor patches to the Debian kernel are released all the time and you get those with a regular "apt upgrade". With Arch and other rolling-release distros, you get major new kernel versions all the time as part of the normal package upgrade flow.
What in the world are you talking about? Distros are constantly pulling versions of the Linux kernel from "master", tweaking them a little for their system, and shipping them. Most do this at least a few times a year.
Some history on Macsyma:
I'm currently doing a masters in CS as a career change from mechanical engineering, and what surprises me is how little time we spend learning to read code. It would be nice to have a Great Works-type program for codebases.
It ran in many places in the 70s and was quite popular at the time of Turbo Pascal.
The TP compiler was created by Anders Hejlsberg, who later became chief architect of Delphi; Microsoft then hired him as chief architect of C#, and now he's a core developer on TypeScript.
Not sure about the source though.
Delphi was/is used quite a bit professionally.
* Sadly, as others have said, these are proprietary internal systems and members of the public won’t ever be able to see them.
* Although the codebases have ancient lineage, the code doesn’t stay static and has been extended and patched over the years, and sometimes transplanted wholesale to a different technology using automated tools (e.g. mainframe to .NET, although that particular one was an unmitigated disaster for maintainability).
* Ancient code usually persists in production for reasons unrelated to its technical qualities, and usually in spite of them. For example, an insurance company managing a closed book of pensions has very few compelling reasons to change its technology because the product it supports doesn’t change, and regulatory changes can be handled by tweaks and auxiliary reporting systems.
* Having said that, ancient code was usually written with a great deal more discipline than modern code, and I believe this stems from the division of labor between Programmer and Analyst, which no longer exists. By having two people involved in every part of the system, you had all the benefits of rubber ducking, peer-review, documentation, and the simple act of talking to another human about what you are doing.
* Ancient systems tend to be very rigorously tested for functionality that directly supports business activity. Ancillary functionality such as management information reports, less so. These systems change infrequently enough that the major defects have all been flushed out and fixed, over the years. There is very little feature churn, which does wonders for stability.
* The code at the heart of these systems usually wasn’t designed with a modern threat model in terms of security. Security is often bolted-on by isolating these systems behind other systems, and they get a lot of mileage out of obscurity too.
* The code is often the least interesting thing about ancient systems anyway. Few are performing rocket science tasks - they are usually just doing batch data processing and green screens. More interesting is the ecosystem of infrastructure, interfaces, job schedules, copybooks and data models, resilience features, and documentation. These are the things that make the system really work. Lines of COBOL are usually unilluminating.
Shotgun compiling wasn’t a thing and programs were planned on paper (flowcharts) before any code was written.
Modern day programmers scoff as they crank out another web app which will be thrown out and totally rewritten in the fashionable framework of the day in less than a year.
You are right, there was more discipline, and systems, no matter what size they were, were painstakingly described on paper first, then implemented (programmers had very little chance to come up with neat tricks), then tested.
Infrastructure was very "primitive", but on the other hand you basically never had obscure bugs manifesting in your database, or your batch scheduler etc.
On the other hand (I am talking about the late 80s), systems had already grown to behemoth size, and unfortunately you could already find stuff that was messy and poorly documented.
Also take into account that IT (at least in Europe) boomed between the 70s and 80s... but this also meant that lots of people started working in IT with almost no formal education and without any "craft experience": exactly the same mistakes and cul-de-sacs were independently made and discovered all over the place, and there was basically no way for developer X to know that someone else had already solved those, even if the other person was sitting across the street or two floors above.
It's the environment around critical systems. When shit hits the fan, it's really bad and there's a feedback loop to developers/managers/companies.
Software is ironed out over time. Bugs are removed and things are very stable in finance.
Software that really doesn't work (and can't be fixed) might be killed or never see the light of day. The result of failure is too bad and too visible.
It's way more than that. There are all sorts of strange incentives. Like, you don't get dinged for fixing a bug instead of making a new feature (frequently there are no new features to make; it's largely maintenance work).
But I am interested in the stuff that survives because of how well-written it is. To your point though, the context and ecosystem of these codebases is as important, if not more so, than the code itself. A lot of comments are pointing to the Linux kernel, which is a tour de force in and of itself, but also really interesting for the community/communities it has inspired.
A couple of key comments for me:
"The idea that new code is better than old is patently absurd. Old code has been used. It has been tested. Lots of bugs have been found, and they’ve been fixed."
"It’s important to remember that when you start from scratch there is absolutely no reason to believe that you are going to do a better job than you did the first time. First of all, you probably don’t even have the same programming team that worked on version one, so you don’t actually have “more experience”. You’re just going to make most of the old mistakes again, and introduce some new problems that weren’t in the original version."
A similar analogy might be Chesterton's Fence: it's important to know why it's in the state it's in before you decide to alter it:
An example is Bitcoin's consensus-critical code: anything that goes in there probably has to be maintained forever (or for the life of Bitcoin)
I don't think Bitcoin is a perfect example here: the Bitcoin consensus code is fragile because it's Bitcoin, not because it's old. The unchangeable consensus code is a result of the fundamental property of Bitcoin itself (every client must run in lockstep; even the slightest deviance cannot be tolerated, which means there should be one and only one Bitcoin client whose behavior remains consistent for as long as Bitcoin exists). This outcome is independent of its code quality, age, and other factors.
...Nevertheless, on second thought, the Bitcoin analogy isn't too far from the truth. If an aged legacy system is the dependency of many other systems and serves a critical role, it effectively becomes Bitcoin-like.
Just think about Boeing's case, where the company is trying to fit new code into the memory of an 8086 chip.
That is to say, old codebases are mostly about organizational politics and social dynamics, rather than anything inherently stable or high-quality about the codebase itself.
So basically, if you want your codebase to still be in production 50 years from now, here's what you do: build something important and mission critical. It should be something that could cost millions if it ever fails. Make the design so bespoke and specialized that only you and people trained by you actually know how it works and how to fix it. Fraternize with the management and use nepotism to grease the wheels and convince them to overlook the inefficiencies inherent in your project. This works well in government projects where you can offer kickbacks in the form of campaign donations. Once you are providing a necessity, and there is a real financial risk associated with your system failing, you're set for life. Expect your contract to be renewed next year, and every year after that, forever and ever and ever and ever.
I have a copy of The Shock of The Old and I did find it thought-provoking. I had recently become aware of the ideas/discipline of discourse analysis so it was more enlightening than it might have been to someone who wasn't really aware that our narratives are framed (like younger me).
I'll be checking out the Russell and Vinsel.
The first one (Edgerton) is history, with the argument being roughly that we should frame the history of technology not just in terms of inventions/innovations, but also in terms of what was actually in use at a particular moment. (So the early 21st century is about self-driving cars and the iPhone, but also the Haber-Bosch Process and COBOL.)
The second book is more about the present, and is concerned with how we tend to talk too much about innovation and not enough about maintenance. (So they love Right to Repair and Open Source maintainers, among other things.)
I've really never solved a dump since COBOL, which I have not worked in since 1996. Debugging has come a long way.
If banks need to update their software, hire people and train them. It will take time, and time is money. That money could be invested into something else (there is an opportunity cost). If that something else is a higher priority to you, then maintaining the integrity of your critical software is not actually as important to you as you say it is.
I will not take a pay cut to work on COBOL for a bank.
You're probably just not the right type of personality. It seems that COBOL programmers have nice, slow paced, easy and safe jobs. That could be why there's so little hiring for COBOL programmers. Once you get someone, they stay put for decades until they retire. I'm sure there are plenty of programmers out there ready to drop the stressful sleepless start-up jobs and settle into something easy and 9-5, but perhaps don't know how to get from A to B.
When I graduated most of my classmates were hired by the local branch of an international consultancy firm (hint: it's currently involved in a legal feud with the GOP in the US). Pay is below market average, and their job involves dealing with legacy Java and COBOL codebases. The Java part is OK, but they find COBOL very painful to work with. They are also required to do unpaid overtime due to hard deadlines.
On the other hand, a few of us went on to work for startups all over the country, and if you compare our careers ever since, they are like night and day. We can switch jobs easily due to having marketable skills (COBOL is a dead end outside a few companies) and we are making twice as much without working overtime.
Either it’s a really easy language to learn or the banks used to invest much more in teaching it than they do now (= zero).
If COBOL is so domain-specific, why not teach it as such on the job? Why is it the public’s fault for not investing in it?
Very few companies invest in training these days. They expect schools and colleges to teach some professional skills (this is not what they should be doing) and invest as little as possible in new hires, especially for technical jobs.
That's what my comment was saying. According to the article, the banks used to pay for their staff to learn COBOL on the job. It's 100% the financial industry's fault that they have allowed this situation to arise. Why did that practice stop? I think this is one of the areas where capitalism has been slowly failing in recent decades.
As it requires a deep understanding of how the monolith works and its connections to other applications, it usually needs to be done in conjunction with the support team which may be outsourced too.
I do remember one particular core that was surprisingly difficult. Everything was encoded in EBCDIC. You had to rotate telnet (!) connections depending on the day of the week. There was one person (a man named Earl) who could help you if you had a problem with the continuous stationery documentation (luckily it was scanned).
Oh the memories...
A separate issue is the Fed exploring creation of “digital currency”, essentially deposit accounts at the central bank. That was encouraged by the threat of Facebook’s Libra and China’s “digital yuan”.
SEPA standards are pretty slick though, built a couple implementations myself for transfer documents (pain.00X) and It Just Works™.
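For a flavor of what that looks like, here's a bare-bones pain.001 skeleton built with Python's ElementTree. The element names are from pain.001.001.03 as I remember them, and every identifier, IBAN and BIC below is made up - validate against the official ISO 20022 XSD before trusting any of this:

    import xml.etree.ElementTree as ET

    def el(parent, tag, text=None):
        # tiny helper: append a child element, optionally with text
        e = ET.SubElement(parent, tag)
        if text is not None:
            e.text = text
        return e

    doc = ET.Element("Document",
                     xmlns="urn:iso:std:iso:20022:tech:xsd:pain.001.001.03")
    init = el(doc, "CstmrCdtTrfInitn")

    hdr = el(init, "GrpHdr")                      # one header per message
    el(hdr, "MsgId", "MSG-0001")
    el(hdr, "CreDtTm", "2020-11-22T12:00:00")
    el(hdr, "NbOfTxs", "1")
    el(el(hdr, "InitgPty"), "Nm", "ACME GmbH")

    pmt = el(init, "PmtInf")                      # one debtor-side batch
    el(pmt, "PmtInfId", "BATCH-0001")
    el(pmt, "PmtMtd", "TRF")
    el(pmt, "ReqdExctnDt", "2020-11-23")
    el(el(pmt, "Dbtr"), "Nm", "ACME GmbH")
    el(el(el(pmt, "DbtrAcct"), "Id"), "IBAN", "DE00123456780000000000")
    el(el(el(pmt, "DbtrAgt"), "FinInstnId"), "BIC", "BANKDEFFXXX")

    tx = el(pmt, "CdtTrfTxInf")                   # one credit transfer
    el(el(tx, "PmtId"), "EndToEndId", "E2E-0001")
    el(el(tx, "Amt"), "InstdAmt", "123.45").set("Ccy", "EUR")
    el(el(tx, "Cdtr"), "Nm", "Some Supplier")
    el(el(el(tx, "CdtrAcct"), "Id"), "IBAN", "DE00123456780000000001")

    print(ET.tostring(doc, encoding="unicode"))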
Where did you learn enough about them to be able to implement them? Are the standard(s) publicly available?
When I show an APL or K gem such as "|/0(0|+)\" (which is a complete and rather efficient implementation of the maximum-subarray-sum problem), people usually complain that "it's unreadable".
But then "ADD ONE TO X GIVING X" compared to "++x" or "x=x+1" shows that explicit is only better than implicit when you're not familiar with the notation.
Readability is all about your expectation and familiarity.
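To make the comparison concrete, here's roughly what that K one-liner computes, sketched in Python (3.8+, for accumulate's initial= argument):

    from itertools import accumulate

    def max_subarray_sum(xs):
        # The K idiom |/0(0|+)\ is a scan followed by a reduce:
        #   scan:   running = max(0, running + x)   <- the 0(0|+)\ part
        #   reduce: take the max over that scan     <- the |/ part
        return max(accumulate(xs, lambda run, x: max(0, run + x), initial=0))

    print(max_subarray_sum([-1, 2, 3, -4, 5]))  # 6, from the slice [2, 3, -4, 5]

Like the K original, it returns 0 for an all-negative input (the empty subarray wins). Once you can read the notation, the one-liner and the Python say exactly the same thing.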
What do you mean by SEP field? (I can't find it in the article either). Is this a programming construct like a field of a record?
I'm not sure why you think "programming languages are moving away from the ++ operator". Care to elaborate?
That's one of the reasons some newer languages like Rust purposefully avoided increment operators, where instead you just do `x += 1` (and the add-assign operator doesn't return anything) or `x = x + 1`.
In C, AFAIK, the evaluation order in f(g(), h()) is still unspecified: either g() or h() could be called first. ++ is just another small detail. (And no, I didn't know C++17 made that behavior defined. Do you have a quick description of how it is defined now?)
“Newer languages like Rust” - just say “Rust”, unless you have another good example of a non-fringe language that adopted most C syntax, including +=, but without ++.
Not all that recently anymore – it was added 20 years ago in Python 2.0:
But the walrus operator, "x := x + 1", was added very recently:
Consider the call f(x += 1, x += 1) when x = 5; what is the call? f(6, 7) or f(7, 6)? IIRC, even f(6, 6) and f(7, 7) were valid behavior in the past.
I supposed you'd run into the same kind of ambiguity with the walrus operator, but Python actually defines left-to-right evaluation of function arguments, so this one is deterministic:
>>> x = 5
>>> f = lambda a, b: (a, b)
>>> print(f(x := x + 1, x := x + 1))
(6, 7)
Or about $112.50/hour.
But this is what your company pays, Corp-to-Corp, to his company.
His gross pay is significantly less - maybe $65/hour. So his annual pay might only be $130,000/year.
And while this is decent, middle-class pay, it is not really very impressive.
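Back-of-envelope, assuming the usual ~2,000 billable hours a year (the rates are the thread's numbers; the contractor's cut is a guess):

    BILLABLE_HOURS = 2000              # ~40 hrs/week x 50 weeks

    c2c_rate = 112.50                  # what the client pays, Corp-to-Corp
    contractor_rate = 65.00            # hypothetical slice the contractor sees

    print(c2c_rate * BILLABLE_HOURS)         # 225000.0 -> the headline $225k/yr
    print(contractor_rate * BILLABLE_HOURS)  # 130000.0 -> his actual gross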
You can skip the hardware purchase by joining the IBM i Hobbyists discord; there are a decent number of people with machines who are happy to give out accounts (myself included)
IBM i is a lesser-used platform for COBOL, though; the main one is IBM mainframes (aka MVS and z/OS). To get started there, you could try Master the Mainframe (I don't have a link handy, sorry) or tk4, and get support from the Mainframe discord. I'm focusing on IBM i because I found it significantly easier to get started with, and the platform itself has a really unique design that I was interested in learning about.
If you're done with that, I presume there's also https://www.coursera.org/professional-certificates/ibm-z-mai... though it's starting to feel even more like marketing after seeing all that.
Next up: learning Ada.
Hence, code written in good programming languages might have a shorter lifetime than, say... COBOL or PHP code. A bit of a paradox.
I think there are legitimate criticisms, such as the fact that it’s not portable between architectures, compilers are typically closed source and expensive, code is typically not portable between compilers.
These are all things we mostly expect languages to have gotten past now, so I can understand the feeling that it’s stuck in the past. The thought experiment of what would a DSL for banking look like though does suggest COBOL isn’t too bad.
It feels like this approach would work well with the approach of companies like MicroFocus: compile COBOL to JVM/CLR, and then allow pieces to be replaced with Java/CLR languages as necessary, and allow running on “normal” machines, removing the dependencies on architecture (by emulating in the compiler). That way, these old codebases almost become a specialised VM language, while regular modern languages can be used to augment.
Second, you make it sound as if COBOL would be somehow better than using Java, C#, ... because it is a DSL. But that's not the case - or can you give some more concrete points for why it should be better?
Let me answer as the GP, having made that claim. Because COBOL can't really do much of anything else ;) E.g. have fun implementing an event-driven GUI app (actually there are/were solutions for running COBOL green-screen mainframe apps in browsers, and things like IBM's HATS for running 3270 apps in Java portlets). Though apart from batch processing, I guess COBOL the language, but not necessarily the runtime, works well also for writing backend service implementation code.
More seriously, as I recall it, COBOL simply has straightforward idioms for arithmetic, date calculations, statically-typed structured file I/O, ISAM file access (and SQL?) as part of the language rather than Java's BigDecimal and various date libraries that all suck in a different way and at best cause enormous fluent-style expressions with cognitive overhead. Plus, COBOL doesn't have reflection and metaprogramming so self-important idiomatic Java code golf is spotted immediately as out of place next to actual business logic.
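To make the arithmetic point concrete, here's a quick Python illustration of what COBOL's fixed-decimal PIC clauses (e.g. PIC 9(7)V99) give you by default, and what Java pushes you to BigDecimal for:

    from decimal import Decimal

    # Binary floats, the default in most modern languages:
    print(0.1 + 0.2)                          # 0.30000000000000004

    # Exact decimal arithmetic, COBOL's default for money:
    print(Decimal("0.10") + Decimal("0.20"))  # 0.30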
> I guess COBOL the language but not necessarily the runtime works well also for writing backend service implementation code
I'm sorry to be picky on words here, but as it is the core of the whole discussion: "works well" is quite meaningless if not put into context. "Works better than X" or "has an advantage over X in some way, for example..." is much more fruitful for a discussion about this topic.
> More seriously, as I recall it, COBOL simply has straightforward idioms for arithmetic, date calculations, statically-typed structured file I/O, ISAM file access (and SQL?) as part of the language rather than Java's BigDecimal and various date libraries that all suck in a different way and at best cause enormous fluent-style expressions with cognitive overhead.
I don't want to be ignorant here... and I'm also not a fan of Java. In fact I dislike Java so much that I declined highly paid jobs. However, is COBOL really better in these things?
So I have seen some production COBOL code at work before and I just looked up some COBOL questions on stackoverflow, e.g.: https://stackoverflow.com/questions/48016044/formatting-date...
Sorry, but even Java's horribly verbose syntax looks way better than this. Apart from the annoying verbosity, I think Java's new time library as well as BigDecimal/BigInteger and also its SQL libraries are not too bad anymore. And if one wants nicer syntax - there are enough languages out there that do it better.
Okay, having SQL directly embeddable is nice, I agree. I don't think it is good programming language design to do that, but then again SQL has been very stable over time and it's certainly nice to use it like that.
> statically-typed structured file I/O
What does that mean? I tried to find out, but didn't really get an idea of what you mean by that.
> Plus, COBOL doesn't have reflection and metaprogramming so self-important idiomatic Java code golf is spotted immediately as out of place next to actual business logic.
That's true, but that doesn't make COBOL better - it just makes Java worse. ;)
Even Java is improving more than COBOL, and improving fast is certainly not Java's strength...
Turns out, fast release cycles do a lot of good for a language.
In ten or twenty years, LDJ will be obsolete, now you have to convert to LDJ2. Now you have COBOL programmers on staff, plus LDJ programmers, plus LDJ2 programmers. No one wants to be on the obsolete teams, your salary budget bloats, and you have more Babel.
If there's always going to be something better, and you're always going to be fragmented, why not stay with just the one single obsolete technology?
It seems almost impossible for any product to ever fully shed itself of a technology. Primarily because while the initial work may yield benefits that management loves, they never prioritize finishing the job as some feature always takes priority.
And now you have 2 problems. You have half your product in one language/tech, and another dark corner in another tech. Your build is more complex and your product more fragile.
Edit: clarification in parentheses.
If I were going to work for a bank, I'd look at Nubank.
* They have made 2 tech acquihires - Cognitect and Plataformatec
* Engineering culture
I can't imagine how bad it would be to build up 10 years in a COBOL factory just to be let go. Pretty easy to get a new job I suppose. Most big business places I've dealt with don't embrace remote either.
I've never heard of SWIFTNet using FTP at its core, do you have some links I could read?
Wikipedia, for example, while not a fully trusted source, has
> Alliance Access (SAA) and Alliance Messaging Hub (AMH) are the main messaging software applications by SWIFT, which allow message creation for FIN messages, routing and monitoring for FIN and MX messages. The main interfaces are FTA (files transfer automated, not FTP)
We don't talk about COBOL so much as we talk with COBOL. I have email after email with COBOL snippets discussing various projects (and I'm an analyst here, not a programmer). No one mentions COBOL by name, but almost anyone, even nonprogrammers, understands it well enough to reason about what is happening and point out errors in the assumptions made by the programmer.
There are IVRs that talk to mainframes via tn3270 to execute CICS transactions. Most CICS programs are written in COBOL. There are external websites, internal web applications, even VBA macros that rely on tn3270 and CICS, thus COBOL.
While other languages may dominate the business of languages, COBOL dominates the business of business.
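If you've never seen the tn3270 side of this, driving a green screen from a script looks roughly like the sketch below, using the py3270 library (which wraps the s3270 emulator). The host name, transaction ID, and screen coordinates are all invented:

    from py3270 import Emulator

    # Drive a tn3270 session the way an IVR or VBA front-end would.
    em = Emulator(visible=False)            # wraps the s3270 emulator
    em.connect('mainframe.example.com')     # tn3270 connection
    em.wait_for_field()
    em.send_string('ACCT')                  # kick off a (made-up) CICS transaction
    em.send_enter()
    em.wait_for_field()
    em.fill_field(5, 20, '1234567890', 10)  # account number at row 5, col 20
    em.send_enter()
    em.wait_for_field()
    print(em.string_get(8, 30, 12))         # scrape the balance off the screen
    em.terminate()

Something very much like this sits behind the IVRs, websites, and VBA macros mentioned above.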
Just wondering how true those statements about COBOL still being so popular are...
A long time ago I worked on the core processing system for a large US insurer; the business would continually call up for confirmation of the business rules from the code. Not a basis for success in a complex business laden with tonnes of regulatory oversight.
wow, very good point
This technology is clearly crap and ripe for disruption. Sure: there are some good reasons for things to be the way that they are, but decades of compatibility work won't save banks against a competent competitor.