Arabic Hate Speech Identification and Masking in Social Media using Deep Learning Models and Pre-trained Models Fine-tuning
Salam Thabet Doghmash, Motaz Saad
https://arxiv.org/abs/2507.23661
Why AI can't possibly make you more productive; long
#AI and "productivity", some thoughts:
Edit: fixed some typos.
Productivity is a concept that isn't entirely meaningless outside the context of capitalism, but it's a concept that is heavily inflected in a capitalist context. In many uses today it effectively means "how much you can satisfy and/or exceed your boss' expectations." This is not really what it should mean: even in an anarchist utopia, people would care about things like how many shirts they can produce in a week, although in an "I'd like to voluntarily help more people" way rather than an "I need to meet this quota to earn my survival" way. But let's roll with this definition for a second, because it's almost certainly what your boss means when they say "productivity", and understanding that word in a different (even if truer) sense is therefore inherently dangerous.
Accepting "productivity" to mean "satisfying your boss' expectations," I will now claim: the use of generative AI cannot increase your productivity.
Before I dive in, it's imperative to note that the big generative models which most people think of as constituting "AI" today are evil. They are 1: pouring fuel on our burning planet, 2: psychologically strip-mining a class of data laborers who are exploited for their precarity, 3: enclosing, exploiting, and polluting the digital commons, and 4: stealing labor from broad classes of people many of whom are otherwise glad to give that labor away for free provided they get a simple acknowledgement in return. Any of these four "ethical issues" should be enough *alone* to cause everyone to simply not use the technology. These ethical issues are the reason that I do not use generative AI right now, except for in extremely extenuating circumstances. These issues are also convincing for a wide range of people I talk to, from experts to those with no computer science background. So before I launch into a critique of the effectiveness of generative AI, I want to emphasize that such a critique should be entirely unnecessary.
But back to my thesis: generative AI cannot increase your productivity, where "productivity" has been defined as "how much you can satisfy and/or exceed your boss' expectations."
Why? In fact, what the fuck? Every AI booster I've met has claimed the opposite. They've given me personal examples of time saved by using generative AI. Some of them even truly believe this. Sometimes I even believe they saved time without horribly compromising on quality (and often, your boss doesn't care about quality anyways if the lack of quality is hard to measure or doesn't seem likely to impact short-term sales/feedback/revenue). So if generative AI genuinely lets you write more emails in a shorter period of time, or close more tickets, or something else along these lines, how can I say it isn't increasing your ability to meet your boss' expectations?
The problem is simple: your boss' expectations are not a fixed target. Never have been. In virtue of being someone who oversees and pays wages to others under capitalism, your boss' game has always been: pay you less than the worth of your labor, so that they can accumulate profit and thus more capital to remain in charge instead of being forced into working for a wage themselves. Sure, there are layers of management caught in between who aren't fully in this mode, but they are irrelevant to this analysis. It matters not how much you please your manager if your CEO thinks your work is not worth the wages you are being paid. And using AI actively lowers the value of your work relative to your wages.
Why do I say that? It's actually true in several ways. The most obvious: using generative AI lowers the quality of your work, because the work it produces is shot through with errors, and when your job is reduced to proofreading slop, you are bound to tire a bit, relax your diligence, and let some mistakes through. More than you would have if you were actually doing and taking pride in the work. Examples are innumerable and frequent, from journalists to lawyers to programmers, and we laugh at them "haha how stupid to not check whether the books the AI reviewed for you actually existed!" but on a deeper level if we're honest we know we'd eventually make the same mistake ourselves (bonus game: spot the swipe-typing typos I missed in this post; I'm sure there will be some).
But using generative AI also lowers the value of your work in another much more frightening way: in this era of hype, it demonstrates to your boss that you could be replaced by AI. The more you use it, and no matter how much you can see that your human skills are really necessary to correct its mistakes, the more it appears to your boss that they should hire the AI instead of you. Or perhaps retain 10% of the people in roles like yours to manage the AI doing the other 90% of the work. Paradoxically, the *more* you get done in terms of raw output using generative AI, the more it looks to your boss as if there's an opportunity to get enough work done with even fewer expensive humans. Of course, the decision to fire you and lean more heavily into AI isn't really a good one for long-term profits and success, but the modern boss did not get where they are by considering long-term profits. By using AI, you are merely demonstrating your redundancy, and the more you get done with it, the more redundant you seem.
In fact, there's even a third dimension to this: by using generative AI, you're also providing its purveyors with invaluable training data that allows them to make it better at replacing you. It's generally quite shitty right now, but the more use it gets by competent & clever people, the better it can become at the tasks those specific people use it for. Using the currently-popular algorithm family, there are limits to this; I'm not saying it will eventually transcend the mediocrity it's entwined with. But it can absolutely go from underwhelmingly mediocre to almost-reasonably mediocre with the right training data, and data from prompting sessions is both rarer and more useful than the base datasets it's built on.
For all of these reasons, using generative AI in your job is a mistake that will likely lead to your future unemployment. To reiterate, you should already not be using it because it is evil and causes specific and inexcusable harms, but in case like so many you just don't care about those harms, I've just explained to you why for entirely selfish reasons you should not use it.
If you're in a position where your boss is forcing you to use it, my condolences. I suggest leaning into its failures instead of trying to get the most out of it, and as much as possible, showing your boss very clearly how it wastes your time and makes things slower. Also, point out the dangers of legal liability for its mistakes, and make sure your boss is aware of the degree to which any of your AI-eager coworkers are producing low-quality work that harms organizational goals.
Also, if you've read this far and aren't yet of an anarchist mindset, I encourage you to think about the implications of firing 75% of (at least the white-collar) workforce in order to make more profit while fueling the climate crisis and in most cases also propping up dictatorial figureheads in government. When *either* the AI bubble bursts *or* the techbros get to live out the beginnings of their worker-replacement fantasies, there are going to be an unimaginable number of economically desperate people living in increasingly expensive times. I'm the kind of optimist who thinks that the resulting social crucible, though perhaps through terrible violence, will lead to deep social changes that effectively unseat from power the ultra-rich that continue to drag us all down this destructive path, and I think it's worth some thinking now about what you might want the succeeding stable social configuration to look like so you can advocate towards that during points of malleability.
As others have said more eloquently, generative AI *should* be a technology that makes human lives on average easier, and it would be were it developed & controlled by humanists. The only reason that it's not, is that it's developed and controlled by terrible greedy people who use their unfairly hoarded wealth to immiserate the rest of us in order to maintain their dominance. In the long run, for our very survival, we need to depose them, and I look forward to what the term "generative AI" will mean after that finally happens.
I find the German public-holiday debate crazy as long as people spend not days but weeks on insane bureaucracy straight out of Schilda. German tax law is completely sick, and the planning approval procedures (Planfeststellungsverfahren) lead to total standstill. In the Ahr valley they are building in the flood zones again because the land swap isn't working out. But sure, abolish a public holiday for the sake of journalistic clickbait. #Fail …
LLM coding is the opposite of DRY
An important principle in software engineering is DRY: Don't Repeat Yourself. We recognize that having the same code copied in more than one place is bad for several reasons:
1. It makes the entire codebase harder to read.
2. It increases maintenance burden, since any problems in the duplicated code need to be solved in more than one place.
3. Because it becomes possible for the copies to drift apart if changes to one aren't transferred to the other (maybe the person making the change has forgotten there was a copy) it makes the code more error-prone and harder to debug.
All modern programming languages make it almost entirely unnecessary to repeat code: we can move the repeated code into a "function" or "module" and then reference it from all the different places it's needed. At a larger scale, someone might write an open-source "library" of such functions or modules and instead of re-implementing that functionality ourselves, we can use their code, with an acknowledgement. Using another person's library this way is complicated, because now you're dependent on them: if they stop maintaining it or introduce bugs, you've inherited a problem, but still, you could always copy their project and maintain your own version, and it would be not much more work than if you had implemented stuff yourself from the start. It's a little more complicated than this, but the basic principle holds, and it's a foundational one for software development in general and the open-source movement in particular. The network of "citations" as open-source software builds on other open-source software and people contribute patches to each others' projects is a lot of what makes the movement into a community, and it can lead to collaborations that drive further development. So the DRY principle is important at both small and large scales.
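As a small illustration of the function-level version of this idea, here is a minimal sketch. The names `normalize_email`, `register_user`, and `is_registered` are hypothetical and chosen only for the example. The normalization rule lives in exactly one function, so a fix to it reaches every caller and the copies cannot drift apart.

```python
def normalize_email(raw: str) -> str:
    """The single shared definition: trim whitespace and lowercase."""
    return raw.strip().lower()


def register_user(users: set[str], email: str) -> None:
    # Re-uses the shared helper instead of repeating strip()/lower() here.
    users.add(normalize_email(email))


def is_registered(users: set[str], email: str) -> bool:
    # Same helper again, so lookup and registration always agree.
    return normalize_email(email) in users


users: set[str] = set()
register_user(users, "  Alice@Example.org ")
print(is_registered(users, "alice@example.org"))
```

If the normalization rule ever needs to change, only `normalize_email` has to be edited.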
Unfortunately, the current crop of hyped-up LLM coding systems from the big players are antithetical to DRY at all scales:
- At the library scale, they train on open source software but then (with some unknown frequency) replicate parts of it line-for-line *without* any citation [1]. The person who was using the LLM has no way of knowing that this happened, or even any way to check for it. In theory the LLM company could build a system for this, but it's not likely to be profitable unless the courts actually start punishing these license violations, which doesn't seem likely based on results so far and the difficulty of finding out that the violations are happening. By creating these copies (and also mash-ups, along with lots of less-problematic stuff), the LLM users (enabled and encouraged by the LLM-peddlers) are directly undermining the DRY principle. If we see what the big AI companies claim to want, which is a massive shift towards machine-authored code, DRY at the library scale will effectively be dead, with each new project simply re-implementing the functionality it needs instead of ever using a library. This might seem to have some upside, since dependency hell is a thing, but the downside in terms of comprehensibility and therefore maintainability, correctness, and security will be massive. The eventual lack of new high-quality DRY-respecting code to train the models on will only make this problem worse.
- At the module & function level, AI is probably prone to re-writing rather than re-using the functions it needs, especially with a workflow where a human prompts it for many independent completions. This part I don't have direct evidence for, since I don't use LLM coding models myself except in very specific circumstances because it's not generally ethical to do so. I do know that when it tries to call existing functions, it often guesses incorrectly about the parameters they need, which I'm sure is a headache and source of bugs for the vibe coders out there. An AI could be designed to take more context into account and use existing lookup tools to get accurate function signatures and use them when generating function calls, but even though that would probably significantly improve output quality, I suspect it's the kind of thing that would be seen as too-baroque and thus not a priority. Would love to hear I'm wrong about any of this, but I suspect the consequences are that any medium-or-larger sized codebase written with LLM tools will have significant bloat from duplicate functionality, and will have places where better use of existing libraries would have made the code simpler. At a fundamental level, a principle like DRY is not something that current LLM training techniques are able to learn, and while they can imitate it from their training sets to some degree when asked for large amounts of code, when prompted for many smaller chunks, they're asymptotically likely to violate it.
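To make the "lookup tools for accurate function signatures" idea concrete, here is a minimal sketch using Python's standard `inspect` module. It is only an illustration of what such a lookup could look like, not a description of how any existing LLM tool works. `resize_image` and `call_is_valid` are hypothetical names invented for the example.

```python
import inspect


def resize_image(path: str, width: int, height: int, *, keep_aspect: bool = True) -> str:
    """Hypothetical existing helper that generated code might want to call."""
    suffix = ":aspect" if keep_aspect else ""
    return f"{path}@{width}x{height}{suffix}"


def call_is_valid(func, *args, **kwargs) -> bool:
    """Check a proposed call against the real signature without running func."""
    try:
        inspect.signature(func).bind(*args, **kwargs)
    except TypeError:
        return False
    return True


# The real signature is available for lookup instead of being guessed.
print(inspect.signature(resize_image))

# A call matching the signature passes; a guessed `size=` parameter does not.
print(call_is_valid(resize_image, "cat.png", 640, 480))
print(call_is_valid(resize_image, "cat.png", size=(640, 480)))
```

`Signature.bind` raises `TypeError` when the arguments don't fit, so a wrongly guessed parameter can be caught before the generated call ever runs.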
I think this is an important critique in part because it cuts against the argument that "LLMs are the modern compilers, if you reject them you're just like the people who wanted to keep hand-writing assembly code, and you'll be just as obsolete." Compilers actually represented a great win for abstraction, encapsulation, and DRY in general, and they supported and are integral to open source development, whereas LLMs are set to do the opposite.
[1] To see what this looks like in action in prose, see the example on page 30 of the NYTimes copyright complaint against OpenAI.
#AI #GenAI #LLMs #VibeCoding
sp_hospital: Hospital ward dynamic contacts (2010)
This dataset contains the temporal network of contacts among patients, between patients and health-care workers (HCWs), and among HCWs in a hospital ward in Lyon, France, from Monday, December 6, 2010 at 1:00 pm to Friday, December 10, 2010 at 2:00 pm. The study included 46 HCWs and 29 patients.
This network has 75 nodes and 32424 edges.
Tags: Social, Offline, Unweighted, Temporal, Metadata
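For readers unfamiliar with temporal networks, a minimal sketch of one way to hold such data follows. It assumes each contact is stored as a `(timestamp, node_a, node_b)` triple. The toy records and node labels below are invented for illustration and are not taken from the dataset, whose actual file format is not described here.

```python
from collections import Counter

# Hypothetical toy contacts: (timestamp_seconds, node_a, node_b).
contacts = [
    (20, "hcw_1", "patient_1"),
    (40, "hcw_1", "patient_1"),
    (40, "hcw_1", "hcw_2"),
    (60, "patient_1", "patient_2"),
]

# Nodes are everyone who appears in at least one contact.
nodes = {n for _, a, b in contacts for n in (a, b)}

# Collapsing time gives a static view: how often each unordered pair met.
pair_counts = Counter(frozenset((a, b)) for _, a, b in contacts)

print(len(nodes), len(contacts))
print(pair_counts[frozenset(("hcw_1", "patient_1"))])
```

In this representation every timestamped contact counts as one edge, so repeated contacts between the same pair add to the edge count but not to the node count.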
Beyond Architectures: Evaluating the Role of Contextual Embeddings in Detecting Bipolar Disorder on Social Media
Khalid Hasan, Jamil Saquer
https://arxiv.org/abs/2507.14231
Overly academic/distanced ethical discussions
Had a weird interaction with @/brainwane@social.coop just now. I misinterpreted one of their posts quoting someone else and I think the combination of that plus an interaction pattern where I'd assume their stance on something and respond critically to that ended up with me getting blocked. I don't have hard feelings exactly, and this post is only partly about this particular person, but I noticed something interesting by the end of the conversation that had been bothering me. They repeatedly criticized me for assuming what their position was, but never actually stated their position. They didn't say: "I'm bothered you assumed my position was X, it's actually Y." They just said "I'm bothered you assumed my position was X, please don't assume my position!" I get that it's annoying to have people respond to a straw man version of your argument, but when I in response asked some direct questions about what their position was, they gave some non-answers and then blocked me. It's entirely possible it's a coincidence, and they just happened to run out of patience on that iteration, but it makes me take their critique of my interactions a bit less seriously. I suspect that they just didn't want to hear what I was saying, while at the same time they wanted to feel as if they were someone who values public critique and open discussion of tricky issues (if anyone reading this post also followed our interaction and has a different opinion of my behavior, I'd be glad to hear it; it's possible I'm effectively being an asshole here and it would be useful to hear that if so).
In any case, the fact that at the end of the entire discussion, I'm realizing I still don't actually know their position on whether they think the AI use case in question is worthwhile feels odd. They praised the system on several occasions, albeit noting some drawbacks while doing so. They said that the system was possibly changing their anti-AI stance, but then got mad at me for assuming this meant that they thought this use-case was justified. Maybe they just haven't made up their mind yet but didn't want to say that?
Interestingly, in one of their own blog posts that got linked in the discussion, they discuss a different AI system, and despite listing a bunch of concrete harms, conclude that it's okay to use it. That's fine; I don't think *every* use of AI is wrong on balance, but what bothered me was that their post dismissed a number of real ethical issues by saying essentially "I haven't seen calls for a boycott over this issue, so it's not a reason to stop use." That's an extremely socially conformist version of ethics that doesn't sit well with me. The discussion also ended up linking this post: https://chelseatroy.com/2024/08/28/does-ai-benefit-the-world/ which bothered me in a related way. In it, Troy describes classroom teaching techniques for introducing and helping students explore the ethics of AI, and they seem mostly great. They avoid prescribing any particular correct stance, which is important when teaching given the power relationship, and they help students understand the limitations of their perspectives regarding global impacts, which is great. But the overall conclusion of the post is that "nobody is qualified to really judge global impacts, so we should focus on ways to improve outcomes instead of trying to judge them." This bothers me because we actually do have a responsibility to make decisive ethical judgments despite limitations of our perspectives. If we never commit to any ethical judgment against a technology because we think our perspective is too limited to know the true impacts (which I'll concede it invariably is) then we'll have to accept every technology without objection, limiting ourselves to trying to improve their impacts without opposing them. Given who currently controls most of the resources that go into exploration for new technologies, this stance is too permissive. 
Perhaps if our objection to a technology was absolute and instantly effective, I'd buy the argument that objecting without a deep global view of the long-term risks is dangerous. As things stand, I think that objecting to the development/use of certain technologies in certain contexts is necessary, and although there's a lot of uncertainty, I expect strongly enough that the overall outcomes of objection will be positive that I think it's a good thing to do.
The deeper point here I guess is that this kind of "things are too complicated, let's have a nuanced discussion where we don't come to any conclusions because we see a lot of unknowns along with definite harms" really bothers me.