Legacy data storage devices like floppy disks therefore present serious complications for archivists. "If you've got a book, it doesn't matter how old it is – you can still read it," Talboom says (provided you understand the language it is written in, of course). With floppy disks, however, you need specialised equipment just to access the content itself – it is like requiring a key to open a book. Even then you might not be able to read what is inside. https://lnkd.in/dMAfTHxQ
Luzia Verdasca Antunes’ Post
-
Following up on my previous post on how LLMs/transformers can be seen as predictors based on a probabilistic model, I would also like to point you to https://lnkd.in/eCe6JMbU https://lnkd.in/eRGpext5 https://lnkd.in/erSnFe_7 https://lnkd.in/es3Rpdbf It is important to understand that applying a softmax normalization at the last transformation step for data in embedding spaces does not (necessarily) yield a (sensible) probabilistic model per se. What also matters, for instance, is a representation of priors. In addition, I would like to point out that using distributions instead of scalars for the parameters of the embedding-space mappings is another way to arrive at the notion of a probabilistic model. Given the large number of mapping parameters required for LLMs, using distributions instead of scalars is computationally demanding. However, the resulting representations of output uncertainty/trustworthiness could indeed be beneficial. See also my previous post https://lnkd.in/ey9RWdrD
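A minimal sketch of the softmax point: the operation always produces non-negative values that sum to one, so its output *looks* like a probability distribution, but that is a property of the normalization itself, not evidence of a calibrated probabilistic model. The toy vocabulary, logits, and temperature below are invented for illustration.

```python
import math
import random

def softmax(logits, temperature=1.0):
    # Shift by the max logit for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp((x - m) / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-token logits for a toy three-word vocabulary.
vocab = ["cat", "dog", "fish"]
logits = [2.0, 1.0, 0.1]

probs = softmax(logits)
# The values are non-negative and sum to one, so they *look* like a
# probability distribution -- but no priors or uncertainty over the
# mapping parameters are involved anywhere in this computation.
next_token = random.choices(vocab, weights=probs)[0]
```

Any monotone rescaling of the logits (e.g. the temperature) changes the "probabilities" arbitrarily while the underlying model stays the same, which is one way to see why softmax output alone is not a sensible probabilistic model per se.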
We often read that LLMs just compute a series of probability distributions from each of which the respective next word is sampled. Well, in the following publications, some light is shed on what this really means. https://lnkd.in/eunmy4PD https://lnkd.in/ee6fChnv https://lnkd.in/ebFMvVBV https://lnkd.in/ewszi_St See also our view on the topic in https://lnkd.in/em5vp6uR
-
I wrote a thing with Robyn Smith and Larry S. about how LLMs can most effectively be used to extract and structure information from historical documents for research purposes. LLMs work best when paired with proven "old" technology such as OCR and full-text search. Proven tech helps target and focus LLM tasks - decreasing costs and improving accuracy. This ensemble approach also offers significant benefits in flexibility and adaptability versus highly tuned, customized, single-purpose ML models, at a generally small cost in performance. Check out the paper, which comes out of the Philly Fed's Center for the REstoration of Economic Data (CREED), which I co-founded in 2023. The paper uses Philadelphia deeds as a case study and shows how our approach achieves 98% precision, and recall in the mid-90s, for the task of finding specific legal language in a corpus of 4.7 million images. https://lnkd.in/ejSXUea3
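A hedged sketch of the ensemble idea, not the paper's actual pipeline: the corpus, keywords, and the `llm_extract` stub below are invented for illustration. Cheap full-text search narrows the corpus first, so the (expensive) LLM step only ever sees promising candidates.

```python
# Toy corpus of OCR'd page texts (stand-ins for scanned deed images).
corpus = {
    "deed_001": "... the party of the first part hereby covenants ...",
    "deed_002": "... an ordinary conveyance with no restrictive language ...",
    "deed_003": "... hereby covenants that the premises shall not be sold ...",
}

def fulltext_prefilter(corpus, keywords):
    """Cheap 'old tech' step: keep only pages whose OCR text contains
    any target keyword, so the LLM never processes the full corpus."""
    return [doc_id for doc_id, text in corpus.items()
            if any(kw in text for kw in keywords)]

def llm_extract(text):
    """Hypothetical placeholder for an LLM call that classifies and
    structures the clause; a real system would send `text` to a model."""
    return {"has_covenant": "covenants" in text}

candidates = fulltext_prefilter(corpus, ["covenants"])
results = {doc_id: llm_extract(corpus[doc_id]) for doc_id in candidates}
```

The design point is that the prefilter controls cost and focuses the LLM, while swapping the stub for a real model (or a different extraction task) requires no retraining, unlike a single-purpose tuned classifier.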
-
10 Days to Understand XRD Graphs and Data — Let’s Learn Together! 💪 Many researchers struggle to interpret XRD patterns — those sharp peaks, strange angles, and endless phase names can be confusing at first. But once you understand the basics, every graph starts telling a clear story about your material. Here’s a simple 10-day roadmap we can follow together: 📅 Day 1–2: Basics of XRD — principle, Bragg’s law, and setup 📅 Day 3–4: Identifying peaks, Miller indices, and structures 📅 Day 5–6: Phase identification using databases (ICDD, COD) 📅 Day 7–8: Calculating crystallite size, strain, and lattice parameters 📅 Day 9–10: Real data interpretation and practice with research papers Whether you’re new to XRD or want to sharpen your skills, join the challenge and let’s make these 10 days count! #XRD #MaterialsScience #ResearchCommunity #Nanomaterials #ScientificLearning
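For Days 1–2 and 7–8 of the roadmap, the two core formulas can be coded directly. This is a sketch with conventional defaults: the Cu K-alpha wavelength (0.15406 nm) and Scherrer constant K ≈ 0.9 are assumptions, so substitute your instrument's values.

```python
import math

def bragg_d_spacing(two_theta_deg, wavelength_nm=0.15406, order=1):
    """Interplanar spacing d from Bragg's law: n*lambda = 2*d*sin(theta).
    XRD patterns report 2-theta, so halve it before taking the sine."""
    theta = math.radians(two_theta_deg / 2.0)
    return order * wavelength_nm / (2.0 * math.sin(theta))

def scherrer_size(fwhm_deg, two_theta_deg, wavelength_nm=0.15406, k=0.9):
    """Crystallite size from the Scherrer equation:
    D = K*lambda / (beta*cos(theta)), with peak width beta in radians."""
    beta = math.radians(fwhm_deg)
    theta = math.radians(two_theta_deg / 2.0)
    return k * wavelength_nm / (beta * math.cos(theta))

# Example: a peak near 2-theta = 38.2 degrees with a 0.2-degree FWHM.
d_spacing_nm = bragg_d_spacing(38.2)      # roughly 0.23-0.24 nm
crystallite_nm = scherrer_size(0.2, 38.2)  # tens of nanometres
```

Note that the Scherrer estimate attributes all peak broadening to crystallite size; Days 7–8 also cover strain, which broadens peaks too and is separated out with methods like Williamson–Hall.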
-
For our October newsletter, we checked in with Ed Baring, Professor of History and Human Values, who is co-leading an innovative project with the Center for Digital Humanities that will transform how scholars understand the circulation and interpretation of Marxist ideas. "Citing Marx" aims to track published citations of the Manifest der Kommunistischen Partei (Communist Manifesto) and Das Kapital (vol. 1) within articles of Die Neue Zeit, a German socialist periodical, focusing on volumes published between 1891 and 1918. And with the expertise of the CDH’s humanities research software engineers, computational tools are being developed to do this research, with the goal of building reusable software for future applications. https://lnkd.in/eP-qtJZ7
-
🌀 New arXiv alert 🥳 In this work, I introduce the local perception operator (LPO, typically used in the steepest-entropy-ascent framework to define 'local' when implementing entropy maximization) in the context of determining compatibility with local hidden variable (LHV) theory. ⬛ When a given quantum state violates a Bell inequality, we say it is nonlocal, i.e., no LHV description can reproduce its statistics. ⬛ But what happens when no violation occurs? Does the state then admit an LHV description? Not necessarily. There exists a hierarchy of inequalities, and if any one of them is violated, the state has no LHV description. ⬛ And how is the violation computed? For the two-party case, you need the global state and the corresponding operators to compute the correlations that enter the violation. I show that 🟩 using an LPO-based witness, one can use local marginals to answer the above question with some confidence. 🟩 Barring marginals that are maximally mixed states, the local and global witnesses formalized in this paper can detect classicality with a geometry-free tight upper bound (for the local witness) and a numerically conjectured bound (for the global witness). 🟩 I also show (and claim) that this LPO-based witness setup extends from the two-party to the multipartite setting as well. Do give it a read! https://lnkd.in/gXPWW5tr
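As background for the "computing the violation" step, here is the textbook CHSH example, not the paper's LPO-based witness: for the two-qubit singlet and the standard optimal measurement settings, the CHSH value reaches 2√2 > 2, so no LHV model can reproduce the statistics. Note how the computation needs the global state and the operators, which is exactly the requirement the post contrasts with using local marginals.

```python
import numpy as np

# Pauli matrices and the two-qubit singlet state |01> - |10| (normalized).
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])
psi = np.array([0.0, 1.0, -1.0, 0.0]) / np.sqrt(2)

def corr(A, B):
    """Correlation <psi| A (x) B |psi> for the global (singlet) state."""
    return float(psi @ np.kron(A, B) @ psi)

# Standard optimal CHSH settings for the singlet.
A0, A1 = Z, X
B0 = -(Z + X) / np.sqrt(2)
B1 = (X - Z) / np.sqrt(2)

S = corr(A0, B0) + corr(A0, B1) + corr(A1, B0) - corr(A1, B1)
# Any LHV model obeys |S| <= 2; quantum mechanics reaches 2*sqrt(2).
```

When S fails to exceed 2, the hierarchy point above applies: passing one inequality does not certify an LHV description, which is what makes classicality witnesses like the paper's interesting.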
-
This is such a computer vision researcher thing to say! Everything is big data if you’re brave enough. Text, too, can contain "too many" words if one treats a complete Dostoyevsky novel as a single data point. But yes, absolutely, point taken from the authors here that there are more images in the world than novels. Funnily, while transformers with linearly scaling attention can handle an entire novel as one sequence, not enough novels have been written to create significant datasets that way. Text transformers usually work with shorter sequences, and text is usually processed at the sentence level.
-
Generational glitch or hidden treasure? Following on from letters from Sir Isaac Newton, notebooks belonging to Charles Darwin, rare Islamic texts, and fragments of a sheet from 200 BC containing the Ten Commandments written in Hebrew, Cambridge University is now looking to save data locked in floppy disks.

You may think that floppy disks, which were popular from the 1970s to the 1990s, would be more resilient than fragile manuscripts, but there is a race against time before they degrade to the point where the data they contain can no longer be accessed. And even when the disks are readable, it then requires the correct hardware, software and knowledge of how they've been formatted to finally unlock the data.

What’s on those disks? They’re not just bits and bytes – they might be drafts of research, personal archives, software artifacts or lost work from people who shaped their fields. The University recently hosted a 'floppy disk workshop' where members of the public could bring old disks they had at home to see what content is locked inside.

This is a reminder that it's not just world-changing history potentially stored on these disks but family history too. So it's not just for specialists. It’s about ensuring future generations can understand the past – even that captured on formats we now consider obsolete.
-
In today’s research blog, Rishik Kolpekwar examines how the Directional Search A* algorithm refines traditional pathfinding by balancing smoothness, accuracy, and computational efficiency. Read here: https://lnkd.in/gSVCpPy7
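For readers who want a concrete baseline before reading about the variant, here is plain grid A* with a Manhattan heuristic; the grid, unit step costs, and 4-connectivity are illustrative assumptions, and Directional Search A* refines this classical scheme in the ways the blog describes.

```python
import heapq

def a_star(grid, start, goal):
    """Classical A* on a 4-connected grid of 0 (free) / 1 (blocked) cells.
    Returns a shortest path as a list of (row, col) tuples, or None."""
    def h(p):  # Manhattan-distance heuristic (admissible on this grid)
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    open_heap = [(h(start), 0, start, [start])]  # (f = g + h, g, node, path)
    seen = set()
    while open_heap:
        f, g, node, path = heapq.heappop(open_heap)
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < len(grid) and 0 <= nc < len(grid[0])
                    and grid[nr][nc] == 0 and (nr, nc) not in seen):
                heapq.heappush(open_heap,
                               (g + 1 + h((nr, nc)), g + 1, (nr, nc),
                                path + [(nr, nc)]))
    return None

path = a_star([[0] * 3 for _ in range(3)], (0, 0), (2, 2))
```

The trade-offs the post discusses show up directly here: the heuristic governs how focused the search is, and 4-connected unit steps produce staircase paths, which is exactly the smoothness problem that directional refinements target.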