Tootfinder

Opt-in global Mastodon full text search. Join the index!

@arXiv_csDL_bot@mastoxiv.page
2025-06-24 08:30:50

Unfolding the Past: A Comprehensive Deep Learning Approach to Analyzing Incunabula Pages
Klaudia Ropel, Krzysztof Kutt, Luiz do Valle Miranda, Grzegorz J. Nalepa
arxiv.org/abs/2506.18069

@whitequark@mastodon.social
2025-09-21 18:58:18

$ git soda pop
update: git-man-page-generator.lokalto

Every major US city has a private foundation supporting police,
with more than 250 nationwide, according to a 2021 report by research and activist groups Little Sis and Color of Change.
The foundations have been used to pay for surveillance technologies in cities like Baltimore and Los Angeles -- without being subject to public scrutiny, according to the report.
More than a year after a digital news outlet and a research group sued the Atlanta Police Foundation for alleged…

@arXiv_csCR_bot@mastoxiv.page
2025-09-17 10:05:30

Characterizing Phishing Pages by JavaScript Capabilities
Aleksandr Nahapetyan, Kanv Khare, Kevin Schwarz, Bradley Reaves, Alexandros Kapravelos
arxiv.org/abs/2509.13186

@arXiv_csAI_bot@mastoxiv.page
2025-08-20 09:29:30

TASER: Table Agents for Schema-guided Extraction and Recommendation
Nicole Cho, Kirsty Fielding, William Watson, Sumitra Ganesh, Manuela Veloso
arxiv.org/abs/2508.13404

@burger_jaap@mastodon.social
2025-07-15 06:06:06

One of the many interesting pages in the Future Energy Scenarios 2025, published today by NESO (🇬🇧 National Energy System Operator).
It focuses on the why, how and what of demand side flexibility.
Key: reward consumers, build trust, automate and smart tariffs.
EV smart charging gateway to more residential flex.
#EV

screenshot of report page: harnessing demand side flexibility to benefit both consumers and the system
Demand flexibility reduces both peak electricity demand and the need for supply side infrastructure.

Rewarding consumers for their high levels of engagement in demand flex has the potential to halve peak electricity demand
@arXiv_csHC_bot@mastoxiv.page
2025-07-14 07:58:52

A Versatile Dataset of Mouse and Eye Movements on Search Engine Results Pages
Kayhan Latifzadeh, Jacek Gwizdka, Luis A. Leiva
arxiv.org/abs/2507.08003

@thesaigoneer@social.linux.pizza
2025-09-10 12:33:47

My install instructions for KDE 6 Wayland on FreeBSD have been updated.
First of all I changed the base to KDE and KDE base apps, instead of just a very minimal install. This follows the recommended install by KDE more closely.
Second: the browser of choice has been changed to Vivaldi and I have included a how-to-update the script and make Vivaldi run under FreeBSD.
Which you can apply anywhere on FreeBSD of course ;-)
Thanks to all for the positive feedback!

@aral@mastodon.ar.al
2025-06-27 06:09:55

You can play with (a supercharged server-driven version of it) today with Kitten:
kitten.small-web.org/tutorials

@kurtsh@mastodon.social
2025-09-07 18:11:48

Reminder that 97% of the 33,000 pages of the Epstein files supposedly released were previously available. The remaining 3% were giant redacted/blacked out pages.
Nothing new was released:
WHERE ARE THE REST OF THE EPSTEIN FILES?
#trump #epstein

@arXiv_csCL_bot@mastoxiv.page
2025-09-15 09:56:51

Beyond Token Limits: Assessing Language Model Performance on Long Text Classification
Miklós Sebők, Viktor Kovács, Martin Bánóczy, Daniel Møller Eriksen, Nathalie Neptune, Philippe Roussille
arxiv.org/abs/2509.10199

@deprogrammaticaipsum@mas.to
2025-07-09 08:41:54

"When we say monumental, you had better believe it; the 500 pages of this volume, laid out with astonishing detail (and a very small font size) summarize the history and evolution of computers from 1945 to 1990. Throughout these pages, Waldrop reveals that the backbone, the axis, the arrow, the orientation, the mastermind of all that history was none other than Lick himself: he was the incarnation of the phrase “being at the right place at the right time”."

@netzschleuder@social.skewed.de
2025-07-30 13:00:08

stanford_web: Webgraph (Stanford)
The web graph of Stanford University (stanford.edu), as collected in 2002. Nodes represent pages and directed edges represent hyperlinks between them.
This network has 281904 nodes and 2312497 edges.
Tags: Informational, Web graph, Unweighted
networks.skewed.de/net/s…

stanford_web: Webgraph (Stanford). 281904 nodes, 2312497 edges. https://networks.skewed.de/net/stanford_web
@daniel@social.telemetrydeck.com
2025-08-06 10:30:40

Today I'm fixing about a gazillion little issues with our website, mostly for SEO reasons:
- All pages should now have a semantically relevant H1 Tag
- Found the last few page titles that didn't make sense and fixed them
- Our logo had no ALT attribute
- The font in the Survey Chart Tooltips was wrong
- Generally survey charts had wayyy too many decimals
- OpenGraph tags are weird but whatever
Keeping technical debt down is like brushing your teeth…
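
Two of the checks in the list above (an H1 on every page, an ALT attribute on every image) are easy to automate. What follows is only a minimal sketch using Python's standard-library HTML parser; the `SeoAudit` class, the `audit` helper and the sample markup are hypothetical illustrations, not the tooling the post's author used.

```python
from html.parser import HTMLParser


class SeoAudit(HTMLParser):
    """Counts <h1> elements and collects <img> tags that lack an alt attribute."""

    def __init__(self):
        super().__init__()
        self.h1_count = 0
        self.images_missing_alt = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "h1":
            self.h1_count += 1
        elif tag == "img" and "alt" not in attr_map:
            # Record the image source so the offending tag can be located.
            self.images_missing_alt.append(attr_map.get("src", "(no src)"))


def audit(html):
    """Return (number of h1 elements, list of img sources without alt)."""
    parser = SeoAudit()
    parser.feed(html)
    parser.close()
    return parser.h1_count, parser.images_missing_alt


# Hypothetical page: no H1 at all, and a logo without an alt attribute.
sample = '<html><body><img src="logo.png"><h2>Pricing</h2></body></html>'
h1_count, missing = audit(sample)
print(f"h1 elements: {h1_count}, images missing alt: {missing}")
```

A page passes these two checks when `h1_count` is at least 1 and `missing` is empty. Whether an H1 is "semantically relevant" still needs a human reader.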

@tiotasram@kolektiva.social
2025-07-06 12:45:11

So I've found my answer after maybe ~30 minutes of effort. First stop was the first search result on Startpage (millennialhawk.com/does-poop-h), which has some evidence of maybe-AI authorship but which is better than a lot of slop. It actually has real links & cites research, so I'll start by looking at the sources.
It claims near the top that poop contains 4.91 kcal per gram (note: 1 kcal = 1 Calorie = 1000 calories, which fact I could find/do trust despite the slop in that search). Now obviously, without a range or mention of an average, this isn't the whole picture, but maybe it's an average to start from? However, the citation link is to a study (pubmed.ncbi.nlm.nih.gov/322359) which only included 27 people with impaired glucose tolerance and obesity. Might have the cited stat, but it's definitely not a broadly representative one if this is the source. The public abstract does not include the stat cited, and I don't want to pay for the article. I happen to be affiliated with a university library, so I could see if I have access that way, but it's a pain to do and not worth it for this study that I know is too specific. Also most people wouldn't have access that way.
Side note: this doing-the-research project has the nice benefit of letting you see lots of cool stuff you wouldn't have otherwise. The abstract of this study is pretty cool and I learned a bit about gut microbiome changes from just reading the abstract.
My next move was to look among citations in this article to see if I could find something about calorie content of poop specifically. Luckily the article page had indicators for which citations were free to access. I ended up reading/skimming 2 more articles (a few more interesting facts about gut microbiomes were learned) before finding this article whose introduction has what I'm looking for: pmc.ncbi.nlm.nih.gov/articles/
Here's the relevant paragraph:
"""
The alteration of the energy-balance equation, which is defined by the equilibrium of energy intake and energy expenditure (1–5), leads to weight gain. One less-extensively-studied component of the energy-balance equation is energy loss in stools and urine. Previous studies of healthy adults showed that ≈5% of ingested calories were lost in stools and urine (6). Individuals who consume high-fiber diets exhibit a higher fecal energy loss than individuals who consume low-fiber diets with an equivalent energy content (7, 8). Webb and Annis (9) studied stool energy loss in 4 lean and 4 obese individuals and showed a tendency to lower the fecal energy excretion in obese compared with lean study participants.
"""
And there's a good-enough answer if we do some math, along with links to more in-depth reading if we want them. A Mayo Clinic calorie calculator suggests about 2250 Calories per day for me to maintain my weight. I think there's probably a lot of variation in that number, but 5% of that would be very roughly 100 Calories lost in poop per day, so maybe an extremely rough estimate for a range of humans might be 50-200 Calories per day. Interestingly, one of the AI slop pages I found asserted (without citation) 100-200 Calories per day, which kinda checks out. I had no way to trust that number though, and as we saw with the 4.91 kcal/gram figure, it might not have good provenance.
To double-check, I visited this link from the paragraph above: sciencedirect.com/science/arti
It's only a 6-person study, but just the abstract has numbers: ~250 kcal/day pooped on a low-fiber diet vs. ~400 kcal/day pooped on a high-fiber diet. That's with intakes of ~2100 and ~2350 kcal respectively, which is close to the number from which I estimated 100 kcal above, so maybe the first estimate from just the 5% number was a bit low.
Glad those numbers were in the abstract, since the full text is paywalled... It's possible this study was also done on some atypical patient group...
Just to come full circle, let's look at that 4.91 kcal/gram number again. A search suggests 14-16 ounces of poop per day is typical, with at least two sources around 14 ounces, or ~400 grams. (AI slop was strong here too, with one including a completely made up table of "studies" that was summarized as 100-200 grams/day). If we believe 400 grams/day of poop, then 4.91 kcal/gram would be almost 2000 kcal/day, which is very clearly ludicrous! So that number was likely some unrelated statistic regurgitated by the AI. I found that number in at least 3 of the slop pages I waded through in my initial search.
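
The arithmetic in this thread can be rechecked in a few lines. This is only a sketch that uses the figures quoted in the post itself (2250 Calories per day, the 5% loss fraction, 14 ounces, 4.91 kcal/gram); the variable names are illustrative and the ounce-to-gram factor is the standard avoirdupois conversion.

```python
# Back-of-the-envelope check of the figures quoted in the post above.
OUNCE_IN_GRAMS = 28.3495  # avoirdupois ounce

daily_intake_kcal = 2250     # maintenance intake quoted in the post
loss_fraction = 0.05         # "≈5% of ingested calories" (stools and urine) in the quoted paragraph
estimated_loss_kcal = daily_intake_kcal * loss_fraction

stool_mass_g = 14 * OUNCE_IN_GRAMS   # "around 14 ounces" per day, as found in the post's search
claimed_kcal_per_gram = 4.91         # the figure the post calls into question
implied_loss_kcal = 400 * claimed_kcal_per_gram  # using the post's rounded 400 g/day

print(f"5% of intake: {estimated_loss_kcal:.1f} kcal/day")
print(f"14 oz in grams: {stool_mass_g:.0f} g")
print(f"400 g at 4.91 kcal/g: {implied_loss_kcal:.0f} kcal/day")
```

The first result (about 112 kcal/day) matches the "very roughly 100 Calories" estimate, and the last (about 1964 kcal/day) is the "almost 2000 kcal/day" that nearly equals the entire daily intake, which is why the 4.91 kcal/gram figure cannot be right in this context.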

We are faced with psychopaths at the helm of our nation's public health, scientific and healthcare efforts.
They want to inflict pain and suffering.
It's a goal, not a by-product
skywriter.blue/pages/did:plc:b

@lilmikesf@c.im
2025-08-16 19:39:00

#NPR reports of #Drumpf admin operatives leaving 8 pages of sensitive internal eyes only 👀 #Putin feting itinerary notes behind in #Alaska hotel sparks ridi…

"Papers with U.S. State Department
markings, found Friday morning in the
business center of an Alaskan hotel,
revealed previously undisclosed and
potentially sensitive details about the
Aug. 15 meetings between Trump and
Putin in Anchorage.

Eight pages, that appear to have been
produced by U.S. staff and left behind
accidentally, shared precise locations and
meeting times of the summit and phone
numbers of U.S. government employees."
Le Monde headlines from  Paris
* Three guests at Anchorage's Hotel
Captain Cook discovered eight
pages of documents left in a public
printer detailing President Donald
Trump and Vladimir Putin's August 15
summit near Joint Base
Elmendorf-Richardson.

* The documents originated from U.S.
staff and revealed meeting times,
precise room names, phone numbers
of officials, and a planned American
bald eagle desk statue gift for Putin.

* Pages described the summit's
planned lunch in honor of Putin,
featuring a three-course meal wi…
RUSSIAN FEDERATION
Bilateral Program with President Trump
H.E. Vladimir PUTIN
President of the Russian Federation
Joint Base Elmendorf-Richardson
Friday, August 15, 2025
11:30 AM
SEQUENCE:
Arctic Warriors Event Center
2:2 Meeting (Billy Mitchell Room)
P: Pool Spray at Top 12:10 PM
Expanded Meeting and Working 12:15 PM
Lunch (Conference Room) 2:45 PM
P: Officials Onl
POTUS Press Conference Prep 2:45 PM
Billy Mitchell Room 3:25 PM
Joint Press Conference (Susitna 3:30 PM
Room 4:30 PM
POTUS Bids Fa…
@zachleat@zachleat.com
2025-08-29 16:59:37

Looks like the above was last updated in November 2024 but links to the HTTP Archive Tech Report for newer data:

@arXiv_csIR_bot@mastoxiv.page
2025-06-30 08:47:30

SERP Interference Network and Its Applications in Search Advertising
Purak Jain, Sandeep Appala
arxiv.org/abs/2506.21598

@gwire@mastodon.social
2025-07-30 08:18:29

YouTube has dropped its "Trending" pages, in favour of AI recommendations. So now, if YouTube pushes far-right incitement, it's a side-effect of a non-deterministic process and not software humans designed.

@arXiv_csSE_bot@mastoxiv.page
2025-07-29 08:12:31

AccessGuru: Leveraging LLMs to Detect and Correct Web Accessibility Violations in HTML Code
Nadeen Fathallah, Daniel Hernández, Steffen Staab
arxiv.org/abs/2507.19549

@newsie@darktundra.xyz
2025-07-30 15:36:08

Journalist Discovers Google Vulnerability That Allowed People to Disappear Specific Pages From Search 404media.co/journalist-discove

@brian_gettler@mas.to
2025-09-01 12:51:45

No, #LaborDay is not simply Americans refusing to be like everyone else and celebrate May 1st. If you're looking for something on the holiday's #history in #Canada, Heron and Penfold's "The Workers' Festival&…

@gedankenstuecke@scholar.social
2025-08-30 15:58:41

I made a repository on #codeberg to document my own #FreshRSS settings to get fulltext feeds of different websites: #RSS

@stiefkind@mastodon.social
2025-08-31 09:30:33

Somebody built a searchable and browsable BYTE magazine archive with all single pages on one single web page. I think, this is a very nice way of visualising a magazine archive. Enjoy: #vintagecomputing

@paulbusch@mstdn.ca
2025-07-01 17:24:13

And yes, some of you might know that this greenhouse (with 2 locations) is owned by the Ferragine family and they got a significant amount of free publicity via Frank Ferragine who was a host on the morning show for #CityTV in Toronto.
shop.bradfordgreenhouses.com/p

@grahamperrin@bsd.cafe
2025-08-30 22:44:51

@… hi, the repo-specific upgrade at <codeberg.org/thesaigoneer/page

@netzschleuder@social.skewed.de
2025-07-30 02:00:03

marvel_partnerships: Marvel character partnerships (2018)
A network of partnerships among characters in the Marvel comic book universe. Nodes are either heroes or villains, and edges represent partnerships between such characters. The partnership network was extracted from Wikipedia pages of these characters, which indicate partnership relations with other such pages.
This network has 350 nodes and 346 edges.
Tags: Social, Fictional, Unweighted

marvel_partnerships: Marvel character partnerships (2018). 350 nodes, 346 edges. https://networks.skewed.de/net/marvel_partnerships
@losttourist@social.chatty.monster
2025-08-25 18:29:02

I'm reading the latest issue of Classic Pop Magazine (yes, an actual magazine made of actual paper and with actual pages & stuff) and in the contents I see this image.
Well, you know me. There is literally only one thing I can say right now:
KEYTAR KLAXON!
#keytar #totp #80sMusic

@rperezrosario@mastodon.social
2025-06-25 06:30:00

Bart Wullems of The Art of Simplicity shares his experience using Microsoft's .NET Source Browser to take a peek at a particular API's internals. The code documentation pages have links to the corresponding GitHub repositories where you can see exactly how any given function is implemented.
"Browse the .NET code base with the .NET Source Browser"

AI startup Perplexity is crawling and scraping content from websites that have explicitly indicated they don’t want to be scraped, according to internet infrastructure provider Cloudflare.
On Monday, Cloudflare published research saying it observed the AI startup ignore blocks and hide its crawling and scraping activities.
The network infrastructure giant accused Perplexity of obscuring its identity when trying to scrape web pages “in an attempt to circumvent the website’s prefe…

@arXiv_csAI_bot@mastoxiv.page
2025-09-05 07:30:20

PG-Agent: An Agent Powered by Page Graph
Weizhi Chen, Ziwei Wang, Leyang Yang, Sheng Zhou, Xiaoxuan Tang, Jiajun Bu, Yong Li, Wei Jiang
arxiv.org/abs/2509.03536

@gedankenstuecke@scholar.social
2025-08-27 22:16:29

I've been at my little "link blog" for my website long enough now it was time to figure out how to make it paginate: #Jekyll, there didn't seem to be a default way to create paginated lists that are based on `_data` files. So I did the minimum viable edits to the jekyll-paginate gem to make it work – which it does reasonably okay.
Now I wonder whether it's still worth actually making a gem out of it?
codeberg.org/gedankenstuecke/p

@arXiv_csDL_bot@mastoxiv.page
2025-08-05 08:00:30

The Attribution Crisis in LLM Search Results
Ilan Strauss, Jangho Yang, Tim O'Reilly, Sruly Rosenblat, Isobel Moure
arxiv.org/abs/2508.00838

@netzschleuder@social.skewed.de
2025-07-30 22:00:04

wiki_science: Wikipedia Map of Science (2020)
A network of scientific fields, extracted from the English Wikipedia in early 2020. Nodes are wikipedia pages representing natural, formal, social and applied sciences, and two nodes are linked if the cosine similarity of the page content is above a threshold. See <s…

wiki_science: Wikipedia Map of Science (2020). 687 nodes, 6523 edges. https://networks.skewed.de/net/wiki_science
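
The linking rule described in this entry (connect two pages when the cosine similarity of their content exceeds a threshold) can be sketched as follows. The post does not say how the page content was vectorized or which threshold was used, so the word-count vectors, the 0.3 threshold and the three toy pages below are all assumptions made purely for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors (Counters)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_edges(pages, threshold):
    """Link every pair of pages whose similarity is above the threshold."""
    vectors = {name: Counter(text.lower().split()) for name, text in pages.items()}
    return [
        (u, v)
        for u, v in combinations(sorted(vectors), 2)
        if cosine_similarity(vectors[u], vectors[v]) > threshold
    ]


# Toy stand-ins for page content (hypothetical, not taken from the dataset).
pages = {
    "physics": "energy matter force energy",
    "chemistry": "matter reaction energy",
    "linguistics": "grammar syntax language",
}
edges = build_edges(pages, threshold=0.3)
print(edges)
```

Raising the threshold makes the resulting graph sparser, so the reported edge count depends directly on whatever cutoff the dataset's authors chose.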

McCarthyism was rate limited
by the number of people engaged,
the amount of money available,
the legal frameworks in place,
and the media environment,
in a way that is very different from what universities will face
not just in the next three years
but until we find a way to contain these people

@smurthys@hachyderm.io
2025-06-30 17:20:19

It is awesome that #WordPress posts can syndicate to the #Fediverse but the syndication can add some really long posts to the Fedi timeline, like 21-pages-if-you-print long. Not to mention formatting issues.
Ain't nobody reading that on Fedi. 🙂‍
E.g.: #UX #technology

@arXiv_csIR_bot@mastoxiv.page
2025-07-30 08:56:22

Page image classification for content-specific data processing
Kateryna Lutsai, Pavel Straňák
arxiv.org/abs/2507.21114 arxiv.org/…

@netzschleuder@social.skewed.de
2025-07-30 15:00:07

wiki_users: Wikipedia user interaction (2011)
A network derived from interactions between editors of the English language Wikipedia, as derived from the edit histories of 563 wiki pages related to politics. A positive sign indicates positive links such as trust or similarities, and a negative sign indicates distrust or disagreement.
This network has 138592 nodes and 740397 edges.
Tags: Social, Online, Signed

wiki_users: Wikipedia user interaction (2011). 138592 nodes, 740397 edges. https://networks.skewed.de/net/wiki_users
@alwynispat@mastodon.sg
2025-06-30 00:14:45

Back from a blog hiatus! ✍️
Wrote about why I picked Cloudflare Pages over Netlify — DNS, Workers, and full-stack control.
#Cloudflare #Netlify #DevOps #WebDev

@unchartedworlds@scicomm.xyz
2025-07-05 10:35:31

Bob Vylan, Palestine Action etc - analysis from Archie Bland in Guardian
"It isn’t just that people are angry that the catastrophe in Gaza isn’t being given due attention: it is that their encounters with observable reality are being flatly denied. ...
"Those people have been told that Gaza protests are hate marches; they can see it’s not true. They have been told that US campus protesters are largely motivated by antisemitism; they can see it’s not true. They have been told that Palestine Action is a terrorist organisation because it spray painted military aircraft; they can see it’s not true. They have been repeatedly told, by Benjamin Netanyahu, that opposition to Israel’s war is antisemitic; they can see it’s not true. They have been told that the British government finds Israel’s actions “intolerable”; they can see it’s not true.
"Now they are being told that opposing the IDF is antisemitic, that the Glastonbury crowd is more virulent than the one at Nuremberg, and that direct action is a form of terrorism. They can see all that’s not true, either, and however far their view is from the front pages, they know that they are far from alone."
#BobVylan #PalestineAction #media #bias #Palestine #Gaza #Israel

@netzschleuder@social.skewed.de
2025-07-30 23:00:04

wiki_science: Wikipedia Map of Science (2020)
A network of scientific fields, extracted from the English Wikipedia in early 2020. Nodes are wikipedia pages representing natural, formal, social and applied sciences, and two nodes are linked if the cosine similarity of the page content is above a threshold. See <s…

wiki_science: Wikipedia Map of Science (2020). 687 nodes, 6523 edges. https://networks.skewed.de/net/wiki_science
@netzschleuder@social.skewed.de
2025-06-24 17:00:08

stanford_web: Webgraph (Stanford)
The web graph of Stanford University (stanford.edu), as collected in 2002. Nodes represent pages and directed edges represent hyperlinks between them.
This network has 281904 nodes and 2312497 edges.
Tags: Informational, Web graph, Unweighted
networks.skewed.de/net/s…

stanford_web: Webgraph (Stanford). 281904 nodes, 2312497 edges. https://networks.skewed.de/net/stanford_web