2025-10-14 08:51:00
MAI-Image-1: Microsoft's first in-house image generator
Microsoft's image generator MAI-Image-1 has made its debut in the LMArena.
https://www.heise.de/news/MAI-Image-1-Mi…
AI image generators like Nano Banana have increased realism by mimicking phone camera traits in contrast, exposure, and sharpening to avoid the uncanny valley (Allison Johnson/The Verge)
https://www.theverge.com/column/843883/ai-image-generators-better-worse
🏈 How name, image, and likeness policies boost college football's competitive balance
#sports
Microsoft unveils MAI-Image-1, its first text-to-image AI model developed in house, and says it excels at photorealistic imagery, like lighting and landscapes (Andrew J. Hawkins/The Verge)
https://www.theverge.com/news/798923/microsoft-ai-image-generator-in-house…
Ordinality of Visible-Thermal Image Intensities for Intrinsic Image Decomposition
Zeqing Leo Yuan, Mani Ramanagopal, Aswin C. Sankaranarayanan, Srinivasa G. Narasimhan
https://arxiv.org/abs/2509.10388 …
A silent scenery in four acts...
(Selenium-toned[1] 4x6 salt print (still in final wash) plus previous work-in-progress stages... see alt text for details)
The picture/motif itself is one of my personal favorites and was taken 4 years ago in the Alpstein massif, when clouds were spontaneously forming around us, creating a wonderful light/scene/drama and a dear memory... now also as a print which likely will/can outlast me
[1] The tones will become more neutral once dry...…
TDADL-IE: A Deep Learning-Driven Cryptographic Architecture for Medical Image Security
Junhua Zhou, Quanjun Li, Weixuan Li, Guang Yu, Yihua Shao, Yihang Dong, Mengqian Wang, Zimeng Li, Changwei Gong, Xuhang Chen
https://arxiv.org/abs/2510.11301
Welcome back - been busy the last days with meeting friends etc 🙂
But I wanted to show some more #photos from the #Lotharsteig in the #nationalparkschwarzwald . Different from …
»British regulators displeased - 4chan is now really supposed to pay a fine:
The British regulator Ofcom has, for the first time, imposed a penalty under the new Online Safety Act. The notorious imageboard 4chan is now really supposed to pay and change its behavior - otherwise the fine will keep rising daily.«
For once, online propaganda is actually being punished and not just talked about.
🍀
Efficient Learned Image Compression Through Knowledge Distillation
Fabien Allemand, Attilio Fiandrotti, Sumanta Chaudhuri, Alaa Eddine Mazouz
https://arxiv.org/abs/2509.10366 ht…
Embedding the Teacher: Distilling vLLM Preferences for Scalable Image Retrieval
Eric He, Akash Gupta, Adian Liusie, Vatsal Raina, Piotr Molenda, Shirom Chabra, Vyas Raina
https://arxiv.org/abs/2510.12014
Image of a quantum-corrected black hole without Cauchy horizons illuminated by a static thin accretion disk
Shilong Huang, Jiawei Chen, Jinsong Yang
https://arxiv.org/abs/2510.09956
Advancing credibility and transparency in brain-to-image reconstruction research: Reanalysis of Koide-Majima, Nishimoto, and Majima https://arxiv.org/abs/2511.07960 by @… et al.;
Weekend #Plankton Factoid 🦠🦐
Most cladocerans have a bivalve carapace, but one genus is very different. Leptodora is globally distributed in the north, large (~2 cm), elongated, and effectively transparent, leading it to be coined "ghost flea" by a colleague. It is a primitive genus and is the only cladoceran with a nauplius stage. Highly predaceous with a huge eye, it preys on juv…
JND-Guided Light-Weight Neural Pre-Filter for Perceptual Image Coding
Chenlong He, Zijing Dong, Min Li, Zhijian Hao, Leilei Huang, Xiaoyang Zeng, Yibo Fan
https://arxiv.org/abs/2510.10648
#Blakes7 Series A, Episode 06 - Seek-Locate-Destroy
SERVALAN: Very well, Councillor Bercol. You may tell the President that I am appointing a Space Commander to take absolute control of this matter. He will be exclusively concerned to seek, locate, and destroy Blake.
BERCOL: Oh, excellent, excellent.
The Trump administration is arguing that requiring real-time American Sign Language interpretation of events like White House press briefings
“would severely intrude on the President’s prerogative to control the image he presents to the public,”
part of a lawsuit seeking to require the White House to provide the services.
Department of Justice attorneys haven’t elaborated on how doing so might hamper the portrayal Trump seeks to present to the public.
But overturning p…
White House: ASL Interpreters Hurt Trump's Image - Joe.My.God.
https://www.joemygod.com/2025/12/white-house-asl-interpreters-hurt-trumps-image/
Testing chatbots on the creation of encoders for audio conditioned image generation
Jorge E. León, Miguel Carrasco
https://arxiv.org/abs/2509.09717 https://
September 2025 was the third-warmest September worldwide since records began.
According to the #Copernicus Climate Change Service, the global average temperature was 16.11 °C, which is 0.66 °C above the 1991-2020 mean.
In #Europa the deviation was as much as 1.23 °C. Such
Hybrid Vision Transformer and Quantum Convolutional Neural Network for Image Classification
Mingzhu Wang, Yun Shang
https://arxiv.org/abs/2510.12291 https://
Image reconstruction with the JWST Interferometer
Max Charles, Louis Desdoigts, Benjamin Pope, Peter Tuthill, Dori Blakely, Doug Johnstone, Shrishmoy Ray, K. E. Saavik Ford, Barry McKernan, Anand Sivaramakrishnan
https://arxiv.org/abs/2510.10924
RE: https://mastodon.social/@fes4ever/115350527094368974
Describe your Mastodon account in a single image
Image detection-based high-throughput sorting of particles using traveling surface acoustic waves in microscale flows
Nikhil Sethia, Joseph Sushil Rao, Amit Manicka, Michael L. Etheridge, Erik B. Finger, John C. Bischof, Cari S. Dutcher
https://arxiv.org/abs/2509.09692
Kicking off the @… annual meeting with some in situ observations with #ECR Shenjie Zhou.
Get the data preprint at the QR code in the image below. Will also go into @soos map
Image reconstruction with the #JWST Interferometer / AMIGO - a Data-Driven Calibration of the JWST Interferometer: https://arxiv.org/abs/2510.10924 / https://arxiv.org/abs/2510.09806 -> How we sharpened the James Webb telescope’s vision from a million kilometres away: https://theconversation.com/how-we-sharpened-the-james-webb-telescopes-vision-from-a-million-kilometres-away-262510 -> thread https://bsky.app/profile/benjaminpope.bsky.social/post/3m34san6dd22m
Immunizing Images from Text to Image Editing via Adversarial Cross-Attention
Matteo Trippodo, Federico Becattini, Lorenzo Seidenari
https://arxiv.org/abs/2509.10359 https://
Anybody know of a Linux image editing tool that lets you dynamically mesh-warp a layer (i.e. click a point in the image and move it to align with a feature on another layer, adding new mesh key points each time you do this)?
No helmet and no high-visibility vest can protect our children from the crimes of far-right terror.
Joseph Karel Vos lived to be 11, then the Nazis murdered him in a gas chamber together with 666 other people.
The federal government, led by the CDU, is well on its way to helping the ideological descendants of the Nazis back into power - instead of fighting them, it adopts their positions and wages a political power struggle against other democratic parties
Been playing with what kind of image to use in the fake window. Some beachy sunset as though looking out from a beach bar?
Maybe that's too lazy just based on the window blind I made a couple of years back. Will I sleep paranoid I'm being watched by Krux?
The @… is on Mastodon. Well, it was here. Because it is fleeing to the Big Tech networks.
That they got hardly any response here is possibly due to the form of their posts: mostly slick PR slogans with sharepics. Hardly any real interaction. I'm not surprised that they never made it into my timeline.
I …
How is it that #PalestineAction, whose worst 'crime' is to spray paint aeroplanes, is declared a #Terrorist organisation, and the #IDF, whose soldiers routinely beat and rape prisoners, shoot ch…
When I am King, all graphs will be required to use the label "lots" along the Y-axis.
https://mastodon.gamedev.place/@eniko/115373539503634831
Achievement of the day:
Produce the same output as me_cleaner for a ThinkPad X270 image.
A few hardcoded assumptions and no extra options at this point.
Off to a good start! 🥳✨👩‍💻
WPP and Google strike a $400M deal to embed Google's AI tools like Veo in WPP marketing and give WPP early access to Google's latest video and image models (Daniel Thomas/Financial Times)
https://www.ft.com/content/ccf5ea33-1398-4b9d-8bf6-a403f2f5493e
…
Time For 9 o'clock #HashTagGames hosted by @…
#ScandalousFoods
Elevating Medical Image Security: A Cryptographic Framework Integrating Hyperchaotic Map and GRU
Weixuan Li, Guang Yu, Quanjun Li, Junhua Zhou, Jiajun Chen, Yihang Dong, Mengqian Wang, Zimeng Li, Changwei Gong, Lin Tang, Xuhang Chen
https://arxiv.org/abs/2510.12084
NOTHING is forever...
Here are tips on how to hold off the latest macOS upgrade for now and, if you want, until support ends years later.
Apple is good at supporting old stuff, but they're bound to get worse at it. Everything's changing faster and faster, and nefarious bad guys make security harder and harder.
#Security
Just suspended/defederated the zhub.link domain that appears to be a white supremacist/hate speech instance whose admin is going around harassing Palestinians and allies on the fediverse and is/was apparently also hosting an account disseminating CSAM (https://hear-me.social/@admin/11266673170955…
On The Road - To Xi’An/ Angles ↕️
在路上 - 去西安/ 角度 ↕️
📷 Pentax MX
🎞️Fujifilm Neopan F, expired 1993
#filmphotography #Photography #blackandwhite
Series A, Episode 12 - Deliverance
SERVALAN: Yes?
VOICE: [V.O.] Space Commander Travis is here.
SERVALAN: Send him in. [Travis enters and stands before Servalan's desk. She ignores him for a moment, concentrating on something else.]
https://blake.torpidity.net/m/112/224 B7B4<…
A Lightweight Ensemble-Based Face Image Quality Assessment Method with Correlation-Aware Loss
MohammadAli Hamidi, Hadi Amirpour, Luigi Atzori, Christian Timmerer
https://arxiv.org/abs/2509.10114
Google rolls out Nano Banana image editing upgrades in Photos, including a "help me edit" feature that lets users make edits using text or voice prompts (Elyse Betters Picaro/ZDNET)
https://www.zdnet.com/article/your-google-
Continuing a bit with the #photos from our #schwarzwald #vacations : (unknowingly) we walked along the Western-way. It was pretty cool because the trail was very diverse as you can see on the photo…
SynthID-Image: Image watermarking at internet scale
Sven Gowal, Rudy Bunel, Florian Stimberg, David Stutz, Guillermo Ortiz-Jimenez, Christina Kouridi, Mel Vecerik, Jamie Hayes, Sylvestre-Alvise Rebuffi, Paul Bernard, Chris Gamble, Miklós Z. Horváth, Fabian Kaczmarczyck, Alex Kaskasoli, Aleksandar Petrov, Ilia Shumailov, Meghana Thotakuri, Olivia Wiles, Jessica Yung, Zahra Ahmed, Victor Martin, Simon Rosen, Christopher Savčak, Armin Senoner, Nidhi Vyas, Pushmeet Kohli
Replaced article(s) found for eess.IV. https://arxiv.org/list/eess.IV/new
[1/1]:
- Rethinking Medical Anomaly Detection in Brain MRI: An Image Quality Assessment Perspective
Pan, Xia, Yan, Xu, Qin, Li, Wu, Jia, Chen, Shi
Moody Urbanity - Relations VI 🧬
情绪化城市 - 关系 VI 🧬
📷 Zeiss IKON Super Ikonta 533/16
🎞️ Ilford HP5 400, expired 1993
#filmphotography #Photography #blackandwhite
The HRRR, a NOAA real-time 3-km resolution, hourly updated, cloud-resolving, convection-allowing atmospheric model, initialized by 3km grids with 3km radar assimilation,
predicts a whole lotta water will fall on coastal Southern California in the early morning https://m.ai6yr.org/@ai6yr/11555228330
InfGen: A Resolution-Agnostic Paradigm for Scalable Image Synthesis
Tao Han, Wanghan Xu, Junchao Gong, Xiaoyu Yue, Song Guo, Luping Zhou, Lei Bai
https://arxiv.org/abs/2509.10441
Series B, Episode 08 - Hostage
MUTOID: Yes, Supreme Commander. Time distort ten.
SERVALAN: Mutoids. Don't you ever question anything?
https://blake.torpidity.net/m/208/384 B7B4
Finally an RGB image of comet #3IATLAS from images taken by #GeminiNorth during the 26 November #shadowTheScientists session: https://noirlab.edu/public/news/noirlab2532/ - the coma has become bluer due to more gas emission. Also published today: an X-ray image by XMM-Newton at https://www.esa.int/ESA_Multimedia/Images/2025/12/XMM-Newton_sees_comet_3I_ATLAS_in_X-ray_light with explanations in http://www.cbat.eps.harvard.edu/iau/cbet/005600/CBET005646.txt which also contains more 3I news.
Meanwhile the paper https://iopscience.iop.org/article/10.3847/2515-5172/ae2915 claims an upper limit for 3I's nucleus diameter of only some 750 meters from the measured non-gravitational acceleration (NGA) parameters: that would be waaay more stringent than the 5.6 km upper limit from early Hubble observations reported in https://iopscience.iop.org/article/10.3847/2041-8213/adf8d8 (where also a lower limit of 440 meters is stated).
Looks like someone is fed up with the honking (#Gehupe):
https://mastodon.social/@BundestagPetitionenNewsbot/115208317518312911
Time For 9 o'clock #HashTagGames hosted by @… Let's play!
#HairStyleABookOrPlay
MonochromeEvolutionaryBioDigitalMultiCellularSymbiosisStruggle
(Selected stills from my infinitely evolving C-SCAPE project, 2022 - made with #MonochromeMonday
Replaced article(s) found for eess.IV. https://arxiv.org/list/eess.IV/new
[1/1]:
- PL-Net: Progressive Learning Network for Medical Image Segmentation
Kunpeng Mao, Ruoyu Li, Junlong Cheng, Danmei Huang, Zhiping Song, ZeKui Liu
Series B, Episode 12 - The Keeper
FOOL: [Singing] I sing you of the Tents, my Master, Charl of the Goths, I sing you of the golden tents, Where your fathers wait to greet you...
BLAKE: [Examining the talisman] It's gone, the brain print's gone!
https://blake.torpidity.net/m/212/515…
DiT360: High-Fidelity Panoramic Image Generation via Hybrid Training
Haoran Feng, Dizhe Zhang, Xiangtai Li, Bo Du, Lu Qi
https://arxiv.org/abs/2510.11712 https://
#FotoVorschlag 'Adventliches' / Advent
My wife was gifted two #Lego sets. And this was one of them. It fits the season and our apartment really well.
Happy third #advent if…
Google adds Gemini 2.5 Flash Image, aka Nano Banana, to Search's AI Mode and Google Lens, on Android in the US for those with an account opted into Search Lab (Abner Li/9to5Google)
https://9to5google.com/2025/10/11/google-lens-ai-mode-nano-banana/
Time For 9 o'clock #HashTagGames hosted by @… Let's play!
#IrritateASitcomCharacter
Series D, Episode 03 - Traitor
SOOLIN: Teleport operating. [Teleport sound. Dayna and Tarrant arrive]
VILA: What are you smirking about? Do you realize we've got half the Federation battle fleet looking for us?
https://blake.torpidity.net/m/403/495 B7B3
Moody Urbanity - UP 🔝
情绪化城市 - 上面 🔝
📷 Nikon FE
🎞️ Ilford HP5 Plus 400, expired 1993
#filmphotography #Photography #blackandwhite
Realism Control One-step Diffusion for Real-World Image Super-Resolution
Zongliang Wu, Siming Zheng, Peng-Tao Jiang, Xin Yuan
https://arxiv.org/abs/2509.10122 https://
Good morning! For this #silentSunday I want to share this #foggy path with you.
I was on my way down, and when I entered the fog, the temperature really dropped. Actually I didn't want to stop or slow down, so I wouldn't cool down.
But at this spot I had to take a quick break for…
Beyond Seeing: Evaluating Multimodal LLMs on Tool-Enabled Image Perception, Transformation, and Reasoning
Xingang Guo, Utkarsh Tyagi, Advait Gosai, Paula Vergara, Ernesto Gabriel Hernández Montoya, Chen Bo Calvin Zhang, Bin Hu, Yunzhong He, Bing Liu, Rakshith Sharma Srinivasa
https://arxiv.org/abs/2510.12712
Series C, Episode 02 - Powerplay
RECEPTIONIST: We have to move on to health reception. You can go together if you wish.
VILA: Oh, thank you.
CALLY: We'd like that, thank you.
RECEPTIONIST: This way.
VILA: Thank you.
https://blake.torpidity.net/m/302/416 B7B5
Google updates NotebookLM's Video Overviews, adding six new Nano Banana-powered visual styles and a new "Brief" format for quick insights (Abner Li/9to5Google)
https://9to5google.com/2025/10/13/notebooklm-video-overviews-styles/
PhySIC: Physically Plausible 3D Human-Scene Interaction and Contact from a Single Image
Pradyumna Yalandur Muralidhar, Yuxuan Xue, Xianghui Xie, Margaret Kostyrko, Gerard Pons-Moll
https://arxiv.org/abs/2510.11649
Series D, Episode 11 - Orbit
SERVALAN: Don't concern yourself, Egrorian. I encourage ambition.
EGRORIAN: I never conspired against you, I swear it! Orac, acknowledge my instructions. [Orac buzzes, but says nothing.] I don't understand.
https://blake.torpidity.net/m/411/363 B7B3…
When I was walking down from the mountains recently I noticed the frost on the trees and branches -- and a few single leaves that were still around.
I really liked this extra touch of color in the grey surroundings. And that it's just a little detail on the huge mountain.
#photography #winter
MagicMirror: A Large-Scale Dataset and Benchmark for Fine-Grained Artifacts Assessment in Text-to-Image Generation
Jia Wang, Jie Hu, Xiaoqi Ma, Hanghang Ma, Yanbing Zeng, Xiaoming Wei
https://arxiv.org/abs/2509.10260
Series A, Episode 13 - Orac
CALLY: We should keep on moving, they could be right behind us.
BLAKE: Yes, without weapons we don't stand a chance. Look, you keep going. I'm going to stay here and try and bring the roof down - block them off.
https://blake.torpidity.net/m/113/347 …
UniFusion: Vision-Language Model as Unified Encoder in Image Generation
Kevin Li, Manuel Brack, Sudeep Katakol, Hareesh Ravi, Ajinkya Kale
https://arxiv.org/abs/2510.12789 https…
Series C, Episode 05 - The Harvest of Kairos
INTERCEPTOR LEADER: [V.O.] Interceptor Leader to Control. Request instructions!
SERVALAN: Hold position, Interceptor Leader, and wait for instructions from Assault Leader One. We will attack when the Liberator breaks out of Alpha Sector. Four should do it nicely. Well, Dastor, I think it's time we had a little strategic counsel. Bring this - Jarvik to me.
I have some more #photos from the lake from the previous post 😄
For these I went to another side of the lake, where there was quite a lot of heather around. - While my wife was having a small snack, I walked around and searched for nice foregrounds with which I could play and make an appealing memory
And even though I'd also have taken a small rest, I'm more than happy that I spe…
GAMMA: Generalizable Alignment via Multi-task and Manipulation-Augmented Training for AI-Generated Image Detection
Haozhen Yan, Yan Hong, Suning Lang, Jiahui Zhan, Yikun Ji, Yujie Gao, Jun Lan, Huijia Zhu, Weiqiang Wang, Jianfu Zhang
https://arxiv.org/abs/2509.10250
Series D, Episode 02 - Power
TARRANT: [Disbelieving] Telekinesis?
ORAC: The power to move objects at a distance using only the mind-
DAYNA: [Cutting him off] Yes, we know what it means.
https://blake.torpidity.net/m/402/445 B7B2
Series C, Episode 05 - The Harvest of Kairos
SHAD: [On screen] Yes, madam.
[Launch platform]
GUARD: Instructions from the Captain. You are to wait for the next shuttle.
CARLON: But there isn't another.
GUARD: One will be sent for you tomorrow.
https://blake.torpidity.net/m/305/214
Series B, Episode 04 - Horizon
KOMMISSAR: [O.O.V.] Your companions are still alive.
CALLY: I feel it. You killed him. You killed him because he would not obey you.
KOMMISSAR: What is she talking about?
https://blake.torpidity.net/m/204/416 B7B3
#Blakes7 Series B, Episode 05 - Pressure Point
SERVALAN: If you tell me about the rendezvous, I will consider sparing your life.
KASABI: My life isn't yours to spare.
SERVALAN: Oh, but it is.
https://blake.torpidity.ne…
Efficient Perceptual Image Super Resolution: AIM 2025 Study and Benchmark
Bruno Longarela, Marcos V. Conde, Alvaro Garcia, Radu Timofte
https://arxiv.org/abs/2510.12765 https://…
Series D, Episode 10 - Gold
SOOLIN: [Searching Keiller.] He's clean.
KEILLER: Of course I'm clean, what do you take me for. Avon and I are old friends. It's good to see you, Avon.
DAYNA: Uh, nobody said move, Keiller. Just relax.
https://blake.torpidity.net/m/410/6 B7B3…
Zero-Shot CFC: Fast Real-World Image Denoising based on Cross-Frequency Consistency
Yanlin Jiang, Yuchen Liu, Mingren Liu
https://arxiv.org/abs/2510.12646 https://
Series D, Episode 13 - Blake
BLAKE: Oh, most of it wasn't on Earth, Tarrant. Not what happened to me. [Arlen enters, Blake turns, Tarrant kicks the gun out of Blake's hand, shoves him into Arlen, and bolts out of the room, strongarming Deva on the way.]
ARLEN: Do you want him killed?!
https://blake.torpidity.net/m/413/…
Series B, Episode 01 - Redemption
BLAKE: Zen, identify the hostiles.
CALLY: Forty-two seconds.
BLAKE: Zen! [Zen burbles]
AVON: The information must be bypassing the translator systems.
CALLY: Thirty-five seconds.
https://blake.torpidity.net/m/201/97 B7B3
Benchmarking foundation models for hyperspectral image classification: Application to cereal crop type mapping
Walid Elbarz, Mohamed Bourriz, Hicham Hajji, Hamd Ait Abdelali, François Bourzeix
https://arxiv.org/abs/2510.11576
Series D, Episode 02 - Power
VILA: Not too close. The gap is all connected with the thickness of the door. So what's the difference between Seskas and Hommiks?
PELLA: How do you know how thick it is?
VILA: By the stress pattern.
https://blake.torpidity.net/m/402/132 B7B4
Zero-shot image privacy classification with Vision-Language Models
Alina Elena Baia, Alessio Xompero, Andrea Cavallaro
https://arxiv.org/abs/2510.09253 https://
Series B, Episode 02 - Shadow
BLAKE: There's no reason why they should.
[Space City. Chairman is seen on communications screen]
ENFORCER: They'll be making planetfall anytime now.
CHAIRMAN: Excellent. Are all Largo's addicts so available?
https://blake.torpidity.net/m/202/444
TerraCodec: Compressing Earth Observations
Julen Costa-Watanabe, Isabelle Wittmann, Benedikt Blumenstiel, Konrad Schindler
https://arxiv.org/abs/2510.12670 https://
PET Head Motion Estimation Using Supervised Deep Learning with Attention
Zhuotong Cai, Tianyi Zeng, Jiazhen Zhang, Eléonore V. Lieffrig, Kathryn Fontaine, Chenyu You, Enette Mae Revilla, James S. Duncan, Jingmin Xin, Yihuan Lu, John A. Onofrey
https://arxiv.org/abs/2510.12758
Detecting Text Manipulation in Images using Vision Language Models
Vidit Vidit, Pavel Korshunov, Amir Mohammadi, Christophe Ecabert, Ketan Kotwal, Sébastien Marcel
https://arxiv.org/abs/2509.10278
ViCO: A Training Strategy towards Semantic Aware Dynamic High-Resolution
Long Cui, Weiyun Wang, Jie Shao, Zichen Wen, Gen Luo, Linfeng Zhang, Yanting Zhang, Yu Qiao, Wenhai Wang
https://arxiv.org/abs/2510.12793
Series B, Episode 12 - The Keeper
BLAKE: Travis! Old man, old man, Lurgen, the healer, did he talk of Star One? Did he speak to you of anything?
OLD MAN: [Virtually inaudible] A fool knows everything and nothing. [Dies]
https://blake.torpidity.net/m/212/518 B7B2
SRUM: Fine-Grained Self-Rewarding for Unified Multimodal Models
Weiyang Jin, Yuwei Niu, Jiaqi Liao, Chengqi Duan, Aoxue Li, Shenghua Gao, Xihui Liu
https://arxiv.org/abs/2510.12784