~500k loc python project here. Detest it. For being such an old and established ...

hbrn · on Oct 25, 2022

I never really understood this distinction between large and small projects.

Why not invest into good boundaries and turn your large project into a group of small projects?

A 500k LOC project should have plenty of natural boundaries. A team should recognize and draw those, regardless of the language being used.

I recently worked on a ~200k LOC Django project: the code was far from perfect, and yet I had no trouble onboarding new team members and making them productive. Here's an isolated 20k LOC domain, you'll grasp it in a week, you'll ship on your first day and then almost every day afterwards, and eventually your knowledge will extend to other areas. Isn't that how every big project should be managed?

Sure, things like strong typing do make the monolithic ball of mud more maintainable. But how about not building a big ball of mud in the first place?

matsemann · on Oct 26, 2022

A better language would enforce those boundaries. In Python you can't, not without making it a completely new app. And once you are in that situation, it's virtually impossible to separate things compared to, for instance, Java.

It's always easy to say "just be better and more diligent programmers", but that doesn't work. If the language promotes spaghetti, spaghetti will be written.

hbrn · on Oct 26, 2022

> It's always easy to say "just be better and more diligent programmers", but that doesn't work. If the language promotes spaghetti, spaghetti will be written.

Oh, I completely agree and I would never say that.

But at the same time, Java promotes complexity and overengineering. I've seen 10+ nested classes for something that was a 5-line function in Python.

The big difference for me is that when I talk to Python engineers, they agree that their spaghetti sucks. They want to evolve out of it, they just haven't found a way yet.

It is much harder to convince Java folks that their class hierarchies are useless.

Fixing Python spaghetti is way easier than fixing the Java folks' mindset.

za3faran · on Oct 30, 2022

> But at the same time, Java promotes complexity and overengineering

Nothing in Java inherently does that. It's actually improved quite dramatically since Java 8, with many features like records, pattern matching, lambdas, SAMs, etc.

mark_l_watson · on Oct 25, 2022

I agree with you that using Python for very large projects might not be the best choice. I love programming in Common Lisp, and it has similar issues to Python.

For huge projects, I still think that Java is a good choice, and although I have only professionally worked on one Haskell project (medium size), I think that Haskell might be good if a team is in place who can use it. A new friend of mine in town is enthusiastic about OCaml, and after a few evenings of studying, I wish that about 8 years ago when I started Haskell I had chosen OCaml for a production typed language.

For Python: I really like Python for deep learning, reinforcement learning, quick and small semantic web apps, etc. The common thread here is that I am not writing much Python myself, instead I am exploiting large well tested libraries.

nequo · on Oct 25, 2022

> I wish that about 8 years ago when I started Haskell I had chosen OCaml for a production typed language.

Do you mind writing a bit more about why? I have been a curious bystander in OCaml land but some of the differences with Haskell, like the lack of type classes, have pushed me toward the latter.

Qem · on Oct 25, 2022

What are the strengths of OCaml when compared with Haskell?

jmt_ · on Oct 25, 2022

I feel very similar but have struggled to set aside enough time to find a better replacement. For work I often build one-off scripts, web scrapers/automators, data tools, and backend web apps/APIs. While I don't disagree with your comments about the ecosystem, I find myself very dependent on it to do the aforementioned work (playwright + beautifulsoup, peewee/native sqlite3 lib, numpy + scikit, Flask/Django), which is probably the main reason I've continued using it. Does anyone have recommendations for some directions I could research? Go and/or Rust seem to be clear contenders but I'm not sure the ecosystem has equivalents or at least mature-enough equivalents for the libraries I use. Very open to learning about other languages too but am simply out of the loop. Something with a great type system and some reasonable flexibility would be amazing (e.g. I like that I can mix classes and functions in modules easily in Python compared to say old-school Java where everything is a class). I'm also not looking for a language that's primarily functional at this time, too much to learn right now on top of a new language, but it's on my long-term to-do list.

matsemann · on Oct 26, 2022

Kotlin ticks a lot of your boxes. Can mix classes and functions. Very easy to write functional code, but also easy to not do it when needed.

emptysea · on Oct 25, 2022

I think the issues you encountered may be due to the specific libraries. Lots of the pre-typing libraries haven’t adopted static typing, like Django and Celery and then when your project is 95% Django and Celery you’re SOL.

I’m not even sure it’s possible to have Django typed without reworking the ORM, I’m thinking about reverse relations, .annotate(), etc.

Yes, there are type stubs for these libraries but they’re either forced to be more strict, preventing use of dynamism, or opt for being less strict but allowing you to use all the library features, at the cost of safety.

I think in the end, new libraries built with static typing in mind, like Pydantic, FastAPI, and EdgeDB, are the answer.

henbruas · on Oct 25, 2022

> Yes, there are type stubs for these libraries but they’re either forced to be more strict, preventing use of dynamism, or opt for being less strict but allowing you to use all the library features, at the cost of safety.

There are type stubs for Django that somewhat avoid these compromises: https://github.com/typeddjango/django-stubs

To be able to do this they have to use a Mypy plugin though. And even then it's still far from perfect.

matsemann · on Oct 26, 2022

And PyCharm and django-stubs can't work together for some reason.

tpict · on Oct 25, 2022

I don’t disagree with the “early TypeScript” comparison, but what’s the issue with *args, **kwargs?

matsemann · on Oct 25, 2022

The problem is that you lose all help from tooling/IDEs. Like in Celery, the definition is "shared_task(*args, **kwargs)". This gives you no indication of what parameters you can actually use. Opening up the code doesn't help, as it's many layers down. The decorated function ends up untyped, but with some new methods on it that again are untyped. Something like originalfunction.delay(...) should have the params of the original decorated function. But no, all that is lost. Just pray that the docs are correct.
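A rough sketch of that failure mode, for anyone who hasn't hit it. `untyped_task` and `send_email` are made-up names standing in for a task decorator; this is not Celery's actual code, just the same shape of problem:

```python
import inspect
from typing import Any, Callable


def untyped_task(func: Callable[..., Any]) -> Callable[..., Any]:
    """Hypothetical stand-in for a task decorator; not Celery's implementation."""

    def wrapper(*args: Any, **kwargs: Any) -> Any:
        return func(*args, **kwargs)

    # Extra method bolted on at runtime; it also accepts anything.
    def delay(*args: Any, **kwargs: Any) -> Any:
        return func(*args, **kwargs)

    wrapper.delay = delay  # type: ignore[attr-defined]
    return wrapper


@untyped_task
def send_email(to: str, retries: int = 3) -> str:
    return f"sent to {to} with {retries} retries"


# All the tooling can report for .delay is "args" and "kwargs";
# the (to, retries) signature of the original function is gone.
delay_params = list(inspect.signature(send_email.delay).parameters)
message = send_email.delay("a@example.com", retries=1)
```

A type checker sees `Callable[..., Any]` here, so a call with the wrong argument names or types would not be flagged before runtime.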

henbruas · on Oct 25, 2022

While it's of course not ideal, stub files can help with this issue. For example you can get stubs for Celery that make both `shared_task` and `delay` properly typed: https://github.com/sbdchd/celery-types

matsemann · on Oct 26, 2022

Only since 3.10, though, before that it was actually impossible to type the decorators correctly.

dragonwriter · on Oct 26, 2022

> Only since 3.10, though, before that it was actually impossible to type the decorators correctly.

There were type gymnastics you could do using overloads to reach any arbitrary level of coverage, but it was ugly and always short of fully general.

Too · on Oct 26, 2022

ParamSpec from 3.10 is adding some improvements to type hinting decorators and wrapper functions that just forward args and kwargs.

https://peps.python.org/pep-0612/