Streaming Sequence-to-Sequence Learning with Delayed Streams Modeling
Neil Zeghidour, Eugene Kharitonov, Manu Orsini, Václav Volhejn, Gabriel de Marmiesse, Edouard Grave, Patrick Pérez, Laurent Mazaré, Alexandre Défossez
https://arxiv.org/abs/2509.08753
gnutella: Gnutella p2p networks (2002)
A sequence of 9 snapshots of the Gnutella peer-to-peer file sharing network from 5-31 August 2002. Nodes are hosts in the Gnutella network topology and edges are connections between them.
This network has 26518 nodes and 65369 edges.
Tags: Technological, Peer-to-peer, Unweighted
https://
Doctor during annual physical, in a sequence of questions: “Any nausea?”
Me: “Just one day last week… when I read too much news.”
Doctor laughs and professional poise falls apart.
A nice human moment.
👨‍⚕️😆
Integrating Rules and Semantics for LLM-Based C-to-Rust Translation
Feng Luo, Kexing Ji, Cuiyun Gao, Shuzheng Gao, Jia Feng, Kui Liu, Xin Xia, Michael R. Lyu
https://arxiv.org/abs/2508.06926
Watching the Once Upon a Time... intro hits me with how tiny and temporary my life is compared to the giant sweep of history.
The scenes of people through different ages make me wonder what any of us really mean in the big picture.
The vast flow of time shown reminds me that anarchism is the way forward to break free from repeating cycles of power.
#Anarchism
Properties of cluster red-sequence spiral galaxies
Wayne A. Barkhouse (University of North Dakota), Lane M. Kashur (Colorado State University), Moreom Akter (University of North Dakota), Sandanuwan P. Kalawila (University of North Dakota), Gihan L. Gamage (New Mexico State University-Alamogordo), Omar López-Cruz (Instituto Nacional de Astrofisica, Optica y Electronica)
CCF: A Context Compression Framework for Efficient Long-Sequence Language Modeling
Wenhao Li, Bangcheng Sun, Weihao Ye, Tianyi Zhang, Daohai Yu, Fei Chao, Rongrong Ji
https://arxiv.org/abs/2509.09199
DualTrack: Sensorless 3D Ultrasound needs Local and Global Context
Paul F. R. Wilson, Matteo Ronchetti, Rüdiger Göbl, Viktoria Markova, Sebastian Rosenzweig, Raphael Prevost, Parvin Mousavi, Oliver Zettinig
https://arxiv.org/abs/2509.09530
The trouble with Louis Armstrong's first movie is, well, there are two troubles: one is that it was widely considered dull; the other is that, while widely praised for the music, it is apparently missing.
I'm thinking, if anyone can find it, the Fediverse can.
It did play Canada, including #Toronto in January 1931 and
In-Context Learning as Nonparametric Conditional Probability Estimation: Risk Bounds and Optimality
Chenrui Liu, Falong Tan, Chuanlong Xie, Yicheng Zeng, Lixing Zhu
https://arxiv.org/abs/2508.08673
malaria_genes: Malaria var DBLa HVR networks
Networks of recombinant antigen genes from the human malaria parasite P. falciparum. Each of the 9 networks shares the same set of vertices but has different edges, corresponding to the 9 highly variable regions (HVRs) in the DBLa domain of the var protein. Nodes are var genes, and two genes are connected if they share a substring whose length is statistically significant. Metadata includes two types of node labels, both based on sequence st…
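The edge rule described above (two genes are linked when they share a sufficiently long substring) can be sketched roughly as follows. This is only an illustration: a fixed minimum length, `min_len`, stands in for the statistical-significance test, and the gene names and sequences are made-up toy data, not taken from the dataset.

```python
from itertools import combinations


def longest_common_substring(a, b):
    """Length of the longest contiguous substring shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the match that ended at the previous characters.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def build_edges(sequences, min_len):
    """Connect two genes if their shared substring is at least min_len long.

    min_len is a hypothetical stand-in for the significance threshold.
    """
    return sorted(
        (u, v)
        for u, v in combinations(sorted(sequences), 2)
        if longest_common_substring(sequences[u], sequences[v]) >= min_len
    )


# Toy sequences, invented for illustration only.
toy = {"geneA": "MKTLLVAG", "geneB": "QQTLLVAP", "geneC": "WWWWWWWW"}
edges = build_edges(toy, min_len=5)
print(edges)
```

Running the same rule once per highly variable region, with the same gene set each time, would give one edge list per region over a shared set of vertices.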
Scots, as a language, is rich in insults; and an ancient one is 'toom (empty) tabard' for a worthless person -- an epithet most notably applied to John de Balliol, who served as King of Scots during the middle period of the wars of independence.
It's interesting to see Grauniad cartoonist Ben Jenkins apply the idea to #Starmer's cabinet.
Mitigating Multi-Sequence 3D Prostate MRI Data Scarcity through Domain Adaptation using Locally-Trained Latent Diffusion Models for Prostate Cancer Detection
Emerson P. Grabke, Babak Taati, Masoom A. Haider
https://arxiv.org/abs/2507.06384
Monitoring Decomposition Attacks in LLMs with Lightweight Sequential Monitors
Chen Yueh-Han, Nitish Joshi, Yulin Chen, Maksym Andriushchenko, Rico Angell, He He
https://arxiv.org/abs/2506.10949
Luminaries in the Sky: The TESS Legacy Sample of Bright Stars. I. Asteroseismic detections in naked-eye main-sequence and sub-giant solar-like oscillators
Mikkel N. Lund, Ashley Chontos, Frank Grundahl, Savita Mathur, Rafael A. García, Daniel Huber, Derek Buzasi, Timothy R. Bedding, Marc Hon, Yaguang Li
https://arxiv.org/abs/2508.086…
A near-linear time approximation scheme for $(k,\ell)$-median clustering under discrete Fréchet distance
Anne Driemel, Jan Höckendorff, Ioannis Psarros, Christian Sohler
https://arxiv.org/abs/2508.07008
Toward Precise Curve Offsetting Constrained to Parametric Surfaces
Jin Zhao, Pengfei Wang, Shuangmin Chen, Jiong Guo, Shiqing Xin, Changhe Tu, Wenping Wang
https://arxiv.org/abs/2509.09333
Orbits and Masses for 156 Companions from Combined Astrometry and Radial Velocities, and A Validation of Gaia Non-Single Star Solutions
Qier An, Timothy D. Brandt, G. Mirek Brandt, Alexander Venner
https://arxiv.org/abs/2508.08374
Funny green effects during a particularly clear #sunset in Bochum, Germany, on 9 July: the Sun disappeared behind a horizontal cloud bank about 7 arc minutes above the true horizon; the image sequence spans 3 seconds.
When will the FBI release the documents that prove Stanley Kubrick is a time traveler who came forward in time to 2023 for movie ideas and, after seeing the Barbie movie, decided to make a whole movie based on the opening sequence?
COSMOS-Web galaxy groups: Evolution of red sequence and quiescent galaxy fraction
Greta Toni, Matteo Maturi, Gianluca Castignani, Lauro Moscardini, Ghassem Gozaliasl, Alexis Finoguenov, Sina Taamoli, B. Hollis Akins, C. Rafael Arango-Toro, M. Caitlin Casey, E. Nicole Drakos, L. Andreas Faisst, Carter Flayhart, Maximilien Franco, Fabrizio Gentile, Ali Hadi, Santosh Harish, Hossein Hatamnia, Olivier Ilbert, Shuowen Jin, S. Jeyhan Kartaltepe, Ali Ahmad Khostovan, M. Anton Koekemoer, Gavin…
Today I found out that ffmpeg can process an input image sequence not only using a printf-style pattern (e.g. frame.%d.jpg) but also using a glob pattern. It needs the parameter "-pattern_type glob" before your -i argument.
The advantage of glob patterns is that they skip over missing numbers. You can use this to turn a bunch of numbered photos from your location scouting into a movie clip that you can show to the client or upload as a preview to an asset libr…
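For anyone who wants to script this, here is a minimal sketch that assembles such a command in Python. The file names (`scout_*.jpg`, `preview.mp4`), the helper name, and the frame rate are assumptions for illustration; only `-pattern_type glob` placed before `-i` comes from the post, while `-framerate` and `-pix_fmt` are ordinary ffmpeg options added here as a reasonable default.

```python
import shlex


def build_glob_command(pattern, output, fps=24):
    """Return an ffmpeg argument list that reads images matching a glob pattern."""
    return [
        "ffmpeg",
        "-framerate", str(fps),   # input frame rate for the image sequence
        "-pattern_type", "glob",  # has to appear before -i
        "-i", pattern,            # files are matched by glob, so numbering gaps are fine
        "-pix_fmt", "yuv420p",    # pixel format that most players accept
        output,
    ]


# Hypothetical file names; replace with your own.
cmd = build_glob_command("scout_*.jpg", "preview.mp4")

# Print a shell-safe version; the pattern is quoted so ffmpeg, not the shell, expands it.
print(shlex.join(cmd))
```

To actually run it, the list can be passed to `subprocess.run(cmd, check=True)`, which avoids shell expansion of the pattern altogether.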
C3PO V: Comoving Pairs Indicate Rotational Spin-Down Drives the Main-Sequence Li-Dip
Qinghui Sun, Yuan-Sen Ting, Barbara J. Anthony-Twarog, Bruce A. Twarog, Fan Liu, Yuxi (Lucy) Lu
https://arxiv.org/abs/2508.08671
LLM4ES: Learning User Embeddings from Event Sequences via Large Language Models
Aleksei Shestov, Omar Zoloev, Maksim Makarenko, Mikhail Orlov, Egor Fadeev, Ivan Kireev, Andrey Savchenko
https://arxiv.org/abs/2508.05688
Totally normal night:
1. Scratching the cat.
2. The first continuous sleep, ending with a #nightmare. I dreamt that my first return train was delayed, and I was likely to be stuck in Głogów. On top of that, the train looked like a nightmare contraption, with a sequence of cubic compartments with no seats, and weirdly shaped holes in the walls instead of doors.
3. Toilet.
4. Checking blood sugar, just in case.
5. Looking for a spot in the bed that's free of sweat.
6. Scratching the cat.
7. A series of nightmares. The most interesting one was about getting up in the morning, in the middle of a gale. The mains voltage was so low I couldn't turn the lights on, and instead of getting myself ready to go out, I was trying hard to measure it, helping myself with a flashlight.
My working theory is that these are side effects of my current medication. One week to go.
ASASSN-24fw: Candidate circumplanetary disk occultation of a main-sequence star
Nadia L. Zakamska, Gautham Adamane Pallathadka, Dmitry Bizyaev, Jaroslav Merc, James E. Owen, Kevin C. Schlaufman, Karolina Bąkowska, Sławomir Bednarz, Krzysztof Bernacki, Agnieszka Gurgul, Kirsten R. Hall, Franz-Josef Hambsch, Krzysztof Kotysz, Sebastian Kurowski, Alexios Liakos, Przemysław J. Mikołajczyk, Erika Pakštienė, Grzegorz Pojmański, Adam Popowicz, Henrique Reggiani, Danie…
The Hubble Arp Galaxy Survey
Julianne J. Dalcanton (Center for Computational Astrophysics, Flatiron Institute, Department of Astronomy, University of Washington), Meredith J. Durbin (Department of Astronomy, University of Washington, Department of Astronomy, University of California, Berkeley), Benjamin F. Williams (Department of Astronomy, University of Washington)
Recursive Aperture Decoded Ultrasound Imaging (READI) With Estimated Motion-Compensated Compounding (EMC2)
Tyler Keith Henry, Darren Dahunsi, Randy Palamar, Negar Majidi, Mohammad Rahim Sobhani, Roger Zemp
https://arxiv.org/abs/2509.08781
Spatial-Temporal-Spectral Mamba with Sparse Deformable Token Sequence for Enhanced MODIS Time Series Classification
Zack Dewis, Zhengsen Xu, Yimin Zhu, Motasem Alkayid, Mabel Heffring, Lincoln Linlin Xu
https://arxiv.org/abs/2508.02839
Decoder-Hybrid-Decoder Architecture for Efficient Reasoning with Long Generation
Liliang Ren, Congcong Chen, Haoran Xu, Young Jin Kim, Adam Atkinson, Zheng Zhan, Jiankai Sun, Baolin Peng, Liyuan Liu, Shuohang Wang, Hao Cheng, Jianfeng Gao, Weizhu Chen, Yelong Shen
https://arxiv.org/abs/2507.06607…
Solar magnetic flux rope eruptions caused by inverse flux feeding processes
Quanhao Zhang, Shangbin Yang, Rui Liu, Min Zhang, Dong Wang, Ake Zhao, Shaoyu Lyu, Anchuan Song, Yuming Wang
https://arxiv.org/abs/2508.08766
Dynamic Chunking for End-to-End Hierarchical Sequence Modeling
Sukjun Hwang, Brandon Wang, Albert Gu
https://arxiv.org/abs/2507.07955 https://arxiv.org/pdf/2507.07955 https://arxiv.org/html/2507.07955
arXiv:2507.07955v1 Announce Type: new
Abstract: Despite incredible progress in language models (LMs) in recent years, largely resulting from moving away from specialized models designed for specific tasks to general models based on powerful architectures (e.g. the Transformer) that learn everything from raw data, pre-processing steps such as tokenization remain a barrier to true end-to-end foundation models. We introduce a collection of new techniques that enable a dynamic chunking mechanism which automatically learns content -- and context -- dependent segmentation strategies learned jointly with the rest of the model. Incorporating this into an explicit hierarchical network (H-Net) allows replacing the (implicitly hierarchical) tokenization-LM-detokenization pipeline with a single model learned fully end-to-end. When compute- and data- matched, an H-Net with one stage of hierarchy operating at the byte level outperforms a strong Transformer language model operating over BPE tokens. Iterating the hierarchy to multiple stages further increases its performance by modeling multiple levels of abstraction, demonstrating significantly better scaling with data and matching a token-based Transformer of twice its size. H-Nets pretrained on English show significantly increased character-level robustness, and qualitatively learn meaningful data-dependent chunking strategies without any heuristics or explicit supervision. Finally, the H-Net's improvement over tokenized pipelines is further increased in languages and modalities with weaker tokenization heuristics, such as Chinese and code, or DNA sequences (nearly 4x improvement in data efficiency over baselines), showing the potential of true end-to-end models that learn and scale better from unprocessed data.
Testing the Role of Merging Binaries in the Formation of the Split Main Sequence in Young Clusters
Nate Bastian, Sebastian Kamann, Florian Niederhofer, Sara Saracino
https://arxiv.org/abs/2509.07708
gnutella: Gnutella p2p networks (2002)
A sequence of 9 snapshots of the Gnutella peer-to-peer file sharing network from 5-31 August 2002. Nodes are hosts in the Gnutella network topology and edges are connections between them.
This network has 6301 nodes and 20777 edges.
Tags: Technological, Peer-to-peer, Unweighted
https://
A Systematic Search for Main-Sequence Dipper Stars Using the Zwicky Transient Facility
Anastasios Tzanidakis, James R. A. Davenport, Neven Caplar, Eric C. Bellm, Wilson Beebe, Doug Branton, Sandro Campos, Andrew J. Connolly, Melissa DeLucchi, Konstantin Malanchev, Sean McGuire
https://arxiv.org/abs/2508.03964…
caida_as: CAIDA AS graphs (2004-2007)
A sequence of 122 network snapshots denoting Autonomous System (AS) relationships on the Internet, from 2004-2007, inferred using the Serial-1 method from RouteViews BGP table snapshots and a set of heuristics.
This network has 25696 nodes and 105332 edges.
Tags: Technological, Communication, Unweighted, Temporal
3D hydrodynamic simulations of massive main-sequence stars -- IV. Internal gravity waves matter for SLF variability
Praneet Pathak, Simon Blouin, Falk Herwig, Paul R. Woodward
https://arxiv.org/abs/2508.03893
Probing accretion and stellar properties in the Orion Nebula with VLT/X-Shooter
L. Piscarreta, G. Beccari, R. A. B. Claes, C. F. Manara, H. M. J. Boffin, T. Jerabkova, B. Ercolano, A. Natta, S. E. van Terwisga
https://arxiv.org/abs/2509.08784
Spectroscopic ages for 4 million main-sequence dwarf stars from LAMOST DR10 estimated with data-driven approach
Jia-Hui Wang, Maosheng Xiang, Meng Zhang, Jiwei Xie, Jian Ge, Jinghua Zhang, Lanya Mou, Jifeng Liu
https://arxiv.org/abs/2508.03019