Tootfinder

Opt-in global Mastodon full text search.

@azonenberg@ioc.exchange
2025-07-28 03:09:12

Initial curve25519 accelerator refactoring update: added new MULT_AREA_OPT parameter.
0 = existing implementation (best for 7 series)
1 = resource sharing between constant and variable multipliers, but still does 32 16x32 multiplies per clock. Much slower on 7 series due to DSP cascade, but hits the same 71 MHz Fmax on Trion.
Fabric usage on Trion is slightly higher (9528 -> 10613 LE) but multiplier block usage is down from 96 to 66.
Next step will be trying to fig…
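
To put the Trion figures quoted in this post in relative terms, here is a minimal Python sketch. The helper name `percent_change` is hypothetical and only the four numbers come from the post; nothing here reflects the author's tooling.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Figures quoted in the post for Trion, MULT_AREA_OPT = 0 versus 1.
le_change = percent_change(9528, 10613)   # logic elements (LE)
mult_change = percent_change(96, 66)      # multiplier blocks

print(f"LE usage change:         {le_change:+.2f}%")
print(f"Multiplier block change: {mult_change:+.2f}%")
```

On these numbers the fabric cost rises by roughly 11% while multiplier block usage falls by roughly 31%.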

@keen456@infosec.exchange
2025-06-23 14:58:48

Is there a way to get hardware accelerated #VP9 support on #Skylake era #intel CPUs? Best I can tell, there's an abandoned Intel-hybrid driver with partial support that Intel doesn't want to bring…

@arXiv_csDL_bot@mastoxiv.page
2025-06-26 09:01:00

The role of preprints in open science: Accelerating knowledge transfer from science to technology
Zhiqi Wang, Yue Chen, Chun Yang
arxiv.org/abs/2506.20225

@arXiv_physicscompph_bot@mastoxiv.page
2025-07-25 09:07:52

A causality inspired acceleration method for the fast temporal superposition of the finite line source solutions
Marc Basquens, Alberto Lazzarotto
arxiv.org/abs/2507.18200

@azonenberg@ioc.exchange
2025-06-20 06:10:21

Tonight's quick ngscopeclient dev fix: GPU accelerating the demo scope.
It now runs quite a few times faster than before (and is faster to process subsequent filter blocks on, since the input data is now GPU resident).
Not super critical but a nice quality-of-life fix since I use the demo scope as a data source for development pretty often.

Image: ngscopeclient displaying a sinewave, some digital waveforms, and an eye pattern

@metacurity@infosec.exchange
2025-07-08 06:53:09

techcrunch.com/2025/07/07/open
OpenAI tightens the screws on security to keep away prying eyes

@arXiv_eessSY_bot@mastoxiv.page
2025-06-24 11:59:10

A detailed simulation model for fifth generation district heating and cooling networks with seasonal latent storage evaluated on field data
Manuel Kollmar, Adrian Bürger, Markus Bohlayer, Angelika Altmann-Dieses, Marco Braun, Moritz Diehl
arxiv.org/abs/2506.18528

@azonenberg@ioc.exchange
2025-07-22 08:37:15

Exploring Trion more as a potential FPGA option for future embedded projects.
The first thing that's hitting me hard is the lack of LUTRAM mode. Anything too small to put in a BRAM, or that you need combinatorial reads on, ends up being synthesized as DFFs.
This causes massive size explosion (and worse performance) for some of my stuff, most notably the curve25519 accelerator which balloons from 7632 LUT / 5737 FF / 32 DSP on Kintex-7 to 21625 LUT / 11528 FF / 96 MULT when sy…
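
The size increase described in this post can be expressed as per-resource ratios. This is a rough sketch that assumes the second set of figures refers to the Trion build, as the context of the post suggests (the text is cut off before saying so). The dictionary names are hypothetical, and Kintex-7 DSP slices and Trion multiplier blocks are different primitives, so the last ratio compares block counts only.

```python
# Resource counts quoted in the post for the curve25519 accelerator.
kintex7 = {"LUT": 7632, "FF": 5737, "multiplier": 32}    # 32 DSP
trion = {"LUT": 21625, "FF": 11528, "multiplier": 96}    # 96 MULT (assumed Trion)

# Ratio of the second build to the Kintex-7 build, per resource type.
ratios = {name: trion[name] / kintex7[name] for name in kintex7}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

By this count the design uses about 2.8 times the LUTs, about twice the flip-flops, and three times the multiplier blocks.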

@azonenberg@ioc.exchange
2025-06-14 14:46:52

Starting to think about GPU acceleration of protocol decodes (rather than just basic math blocks) in ngscopeclient.
Here's the filter graph I'm thinking of using as a benchmark: dual lane QSGMII, 20M points per channel.
The filter graph takes 944 ms end to end to run on my box (2x Xeon 6144 + 2080 Ti).
Major time consumers:
* Eye pattern (~345 ms)
* QSGMII (~337 ms)
* CDR PLL (~310 ms)
* 8B10B (~160 ms)
* SGMII (~120 ms)
Note that the …

Image: ngscopeclient filter graph screenshot showing two QSGMII decodes + eye patterns

@azonenberg@ioc.exchange
2025-06-16 06:56:15

Turns out the reason that the GPU accelerated level detector made the CDR slower is that I was working on test data loaded from a file, and not moving the data from CPU to GPU before starting the filter graph. So of *course* it was slower, due to the unnecessary data movement.
But that's unrealistic since on a real high-performance scope driver (e.g. thunderscope) the first thing we typically do is push data from CPU to GPU to do the int-float conversion there.
Time to repea…
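
For readers unfamiliar with the int-float conversion mentioned in this post: per sample it is typically a scale and an offset applied to the raw ADC code. The sketch below is a CPU-side illustration in plain Python under that assumption. The function name, the `gain` and `offset` parameters, and the 8-bit sample format are all hypothetical; this is not ngscopeclient or ThunderScope driver code, where the same arithmetic would run on the GPU after the raw samples are uploaded.

```python
from array import array


def adc_to_volts(raw: array, gain: float, offset: float) -> array:
    """Convert raw integer ADC codes to 32-bit float volts.

    Hypothetical model: volts = code * gain + offset.
    """
    return array("f", (code * gain + offset for code in raw))


# Hypothetical signed 8-bit samples and scale factors.
raw_samples = array("b", [-128, 0, 127])
volts = adc_to_volts(raw_samples, gain=0.5, offset=0.0)
print(list(volts))
```

Uploading the narrow integer samples and converting on the GPU moves less data over the bus than converting on the CPU and uploading floats, which fits the post's point about where data movement should happen.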