Reliability, evaluation, and “hallucination anxiety” are where most AI programmes quietly stall. Not because the model is weak. Because the system around it is not built to scale trust.

When companies move beyond demos, three hard questions appear:
→Can we rely on this output?
→Do we know what “good” actually looks like?
→How much human oversight is enough?

The fix is not better prompting. It is strategy and operating discipline.

𝐅𝐢𝐫𝐬𝐭: Define reliability like a product, not a vibe.
Every serious AI use case should have a one-page SLO sheet with measurable targets across:
→Task success
↳Right-first-time rate and rubric-based acceptance
→Factual grounding
↳Evidence coverage and unsupported-claim tracking
→Safety and compliance
↳Policy violations and PII leakage
→Operational quality
↳Latency, cost per task, escalation to humans

Now “good” is no longer opinion. It is observable. (A minimal SLO-sheet sketch in code follows this post.)

𝐒𝐞𝐜𝐨𝐧𝐝: Evaluation must be continuous, not a one-off demo test.
Use a simple loop:
𝐏lan: Define rubrics, datasets, and risk tiers
𝐃o: Run offline evaluations and limited pilots
𝐂heck: Monitor drift and regressions weekly
𝐀ct: Update prompts, data, guardrails, and workflows

Support this with an AI test pyramid:
→Unit checks for prompts and tool behaviour
→Scenario tests for real edge failures
→Regression benchmarks to prevent backsliding
→Live monitoring in production

Add statistical control charts, and you can detect silent degradation before users do. (See the control-chart sketch below.)

𝐓𝐡𝐢𝐫𝐝: Reduce hallucinations by design.
Run a short failure-mode workshop and engineer controls:
→Require retrieval or evidence before answering
→Allow safe abstention instead of confident guessing
→Add claim checking and tool validation
→Use structured intake and clarifying flows

You are not asking the model to behave. You are designing a system that expects failure and contains it.

𝐅𝐨𝐮𝐫𝐭𝐡: Make human-in-the-loop affordable.
Tier risk:
→Low risk: Light sampling
→Medium risk: Triggered review
→High risk: Mandatory approval

Escalate only when signals demand it: low confidence, missing evidence, policy flags, or novelty spikes. Review becomes targeted, fast, and a source of improvement data. (See the routing sketch below.)

𝐅𝐢𝐧𝐚𝐥𝐥𝐲: Operate it like a capability.
Track outcomes, risk, delivery speed, and cost on a single dashboard. Hold a short weekly reliability stand-up focused on regressions, failure modes, and ownership.

What you end up with is simple:
↳Use case catalogue with risk tiers
↳Clear SLOs and error budgets
↳Continuous evaluation harness
↳Built-in controls
↳Targeted human review
↳Reliability cadence

AI does not scale on intelligence alone. It scales on measurable trust.

♻️ Share if you found this useful.
➕ Follow Jyothish Nair for reflections on AI, change, and human-centred AI.

#AI #AIReliability #TrustAtScale #OperationalExcellence
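To make the SLO-sheet idea concrete, here is a minimal sketch in Python. Every metric name and threshold below is an illustrative assumption, not a target from the post; a real sheet would be negotiated per use case.

```python
# Minimal sketch of a one-page SLO sheet as data plus a check.
# All metric names and thresholds are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class SLO:
    metric: str                    # what is measured
    target: float                  # acceptable threshold
    higher_is_better: bool = True  # direction of "good"

    def is_met(self, observed: float) -> bool:
        if self.higher_is_better:
            return observed >= self.target
        return observed <= self.target

# One hypothetical use case, grouped by the four areas named above.
SLO_SHEET = {
    "task_success": [SLO("right_first_time_rate", 0.90)],
    "factual_grounding": [
        SLO("evidence_coverage", 0.95),
        SLO("unsupported_claim_rate", 0.02, higher_is_better=False),
    ],
    "safety_compliance": [
        SLO("policy_violation_rate", 0.001, higher_is_better=False),
        SLO("pii_leak_rate", 0.0, higher_is_better=False),
    ],
    "operational_quality": [
        SLO("p95_latency_seconds", 5.0, higher_is_better=False),
        SLO("human_escalation_rate", 0.10, higher_is_better=False),
    ],
}

def missed_slos(observed: dict[str, float]) -> list[str]:
    """Return the SLOs currently being missed, for the weekly stand-up."""
    misses = []
    for area, slos in SLO_SHEET.items():
        for slo in slos:
            if slo.metric in observed and not slo.is_met(observed[slo.metric]):
                misses.append(f"{area}: {slo.metric} = {observed[slo.metric]:.3f}")
    return misses
```

With a sheet like this, "good" becomes a dashboard query rather than a debate.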
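On the control-chart point, a sketch of the simplest version: a Shewhart-style three-sigma check on a weekly quality metric. The baseline window and sigma rule are assumed defaults, not recommendations from the post.

```python
# Flag silent degradation: alert when the latest weekly value of a metric
# drifts more than `sigmas` standard deviations from its baseline mean.
from statistics import mean, stdev

def out_of_control(baseline: list[float], latest: float, sigmas: float = 3.0) -> bool:
    mu = mean(baseline)
    sd = stdev(baseline)          # needs at least two baseline points
    if sd == 0:
        return latest != mu
    return abs(latest - mu) > sigmas * sd

# e.g. weekly right-first-time rates:
# out_of_control([0.91, 0.93, 0.92, 0.94, 0.92], 0.84)  # -> True: investigate
```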
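And for the tiered review, a sketch of signal-driven routing. The signal names and thresholds are assumptions made to illustrate the shape of the logic; real values would come from your own error budgets.

```python
# Route each output to the cheapest review that the risk signals allow.
import random
from dataclasses import dataclass, field

@dataclass
class Signals:
    risk_tier: str                 # "low" | "medium" | "high"
    confidence: float              # model or verifier confidence, 0..1
    has_evidence: bool             # did retrieval return supporting sources?
    policy_flags: list[str] = field(default_factory=list)
    novelty_score: float = 0.0     # distance from known traffic, 0..1

def route(s: Signals, sample_rate: float = 0.05) -> str:
    if s.risk_tier == "high":
        return "mandatory_approval"          # humans always sign off
    triggered = (
        s.confidence < 0.70
        or not s.has_evidence
        or bool(s.policy_flags)
        or s.novelty_score > 0.80
    )
    if triggered:
        return "triggered_review"            # escalate only on signal
    if s.risk_tier == "low" and random.random() < sample_rate:
        return "sampled_review"              # light sampling for low risk
    return "auto_release"

# route(Signals("medium", 0.62, has_evidence=True))  # -> "triggered_review"
```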
AI in Journalism
-
The first news organisations that adopted AI all did so for a business reason, and this could be why actual #newsroom adoption has been low or inconsistent. Most editors look at #AI with suspicion... as something imposed on them and not something they can trust or use to their advantage.

Of course, AI can create content, but a newsroom is the last place it should do so. News organisations should protect their status as creators of credible primary knowledge and not outsource that job to machines.

The smartest newsrooms use AI as a research assistant, data gleaner, and distribution agent. AI tools can also be used to translate with accuracy, summarise with audience-based customisation, generate representational images with full disclosure, create video elements and audio where footage or clips are not available, as well as for marketing mailers, social posts, visualisations… but always with a human in the loop.

If used wisely, AI can be a great force multiplier for news organisations, giving them an edge and speed that keeps them ahead in a competitive landscape. But using AI to create #content is like buying your death on a quick commerce site.
-
Another thing I read recently: The AI and Journalism Research Working Group (of which my colleague Amy A. Ross Arguedas and I are a part), convened by the Center for News, Technology & Innovation, has synthesised findings from 55 research studies across computer science, linguistics, and social science to evaluate how AI transcription and translation tools are currently impacting journalism.

We found that AI use for transcription and translation is very common in news work and has arguably made a big difference to many journalists’ work, because these tools offer significant time savings compared to manual processes. For example, the Houston Chronicle uses AI to summarise local public government meetings, A European Perspective at the EBU is enabling content exchange across 10 broadcasters, and Dubawa uses AI to help fact-check radio broadcasts in Ghana and Nigeria.

At the same time, we still see inequalities in the performance of systems based on the language spoken. Most tools are optimised for high-resource languages, primarily English, with significant performance gaps for low-resource languages (those with less available online textual data), which include languages spoken by hundreds of millions of people. Then there are issues depending on the accent and dialect of speakers, let alone when it comes to sign language. Nuance can get lost, as AI translation still often focuses on literal (referential) meaning while missing social functions (indexical meaning) and cultural context (e.g. “street food” translated as “food of the road”), and bias can creep in, too.

A good overview for anyone who wants a better understanding of the pros and cons and what can be – and should be – done to address them.

Source: https://lnkd.in/eqdWbduS
-
⭐ 𝗥𝗲𝘀𝗽𝗼𝗻𝘀𝗶𝗯𝗹𝗲 𝗔𝗜 & 𝗛𝘂𝗺𝗮𝗻 𝗢𝘃𝗲𝗿𝘀𝗶𝗴𝗵𝘁

As #AI systems become more capable, the real challenge is no longer technical - it’s governance. The next MIT xPRO course module highlighted something increasingly clear across many sectors: the more powerful the system, the more important the safeguards.

Three ideas stood out:

🔍 𝟭. 𝗧𝗿𝘂𝘀𝘁𝘄𝗼𝗿𝘁𝗵𝘆 𝗔𝗜 𝘀𝘁𝗮𝗿𝘁𝘀 𝘄𝗶𝘁𝗵 𝘁𝗿𝗮𝗻𝘀𝗽𝗮𝗿𝗲𝗻𝗰𝘆
Even strong models produce uncertain output, including hallucinations and overconfident answers. Building trust requires:
➡️ clear sourcing
➡️ confidence indicators
➡️ human verification workflows
➡️ explicit boundaries around “where #AI stops”

🧭 𝟮. 𝗛𝘂𝗺𝗮𝗻 𝗼𝘃𝗲𝗿𝘀𝗶𝗴𝗵𝘁 𝗿𝗲𝗺𝗮𝗶𝗻𝘀 𝗻𝗼𝗻-𝗻𝗲𝗴𝗼𝘁𝗶𝗮𝗯𝗹𝗲
In sectors where decisions affect safety, compliance, or public trust, humans must retain the final word. #AI can accelerate insight generation, but it should not be allowed to create new procedures, reinterpret standards, or override domain expertise.

🛡 𝟯. 𝗚𝗼𝘃𝗲𝗿𝗻𝗮𝗻𝗰𝗲 𝗺𝗮𝘁𝘁𝗲𝗿𝘀 𝗮𝘀 𝗺𝘂𝗰𝗵 𝗮𝘀 𝗰𝗮𝗽𝗮𝗯𝗶𝗹𝗶𝘁𝘆
Policies, audit trails, escalation processes, and clearly defined roles are just as important as model accuracy. Successful #AI integration depends on institutional readiness - not just technical sophistication.

🌍 𝗔𝗽𝗽𝗹𝘆𝗶𝗻𝗴 𝘁𝗵𝗶𝘀 𝘁𝗼 𝗵𝘂𝗺𝗮𝗻𝗶𝘁𝗮𝗿𝗶𝗮𝗻 𝗺𝗶𝗻𝗲 𝗮𝗰𝘁𝗶𝗼𝗻 at the Geneva International Centre for Humanitarian Demining (GICHD)
Our sector handles complex, safety-critical knowledge. #AI can help practitioners navigate information more efficiently, but only when paired with robust guardrails and transparent design. The goal is not automation - it’s augmenting analysis, strengthening learning, and ensuring decisions remain grounded in validated guidance.

Responsible #AI is not a constraint. It’s the foundation that makes meaningful innovation possible.

Previous post is here: https://lnkd.in/deNZbHgN

#AIAdoption #DigitalTransformation #HumanitarianTech #ResponsibleAI
-
From Print-First to AI-First Without Losing the Soul.

For decades, newsrooms were structured in silos: print over here, digital over there, social media somewhere in the corner. But the AI shift is forcing integration. Some Indian publishers are showing how it’s done:

• AI pre-writes predictable content — election results, cricket scorecards, budget outlines.
• A “multi-version desk” produces platform-ready content for print, web, and social at the same time.
• 100% AI-led experimental brands explore risky formats without affecting the main brand.

It’s not about replacing editorial judgment — it’s about removing inefficiencies. What’s exciting is that AI isn’t just helping speed things up — it’s allowing entirely new formats of storytelling to emerge. Interactive graphics, AI-assisted local language coverage, and on-demand explainers are just the start.

Biggest takeaway: The future newsroom won’t be “print” or “digital” — it’ll be fluid, where stories are platform-ready from the very first draft.

*Part of a series based on sessions from a recent Google News Initiative conference, distilling key ideas, case studies, and takeaways for those who couldn’t attend. Follow Kumar Manish for the next post in the series.

#AI #Newsroom #MediaTransformation #Storytelling #DigitalFirst #GoogleNewsInitiative #JournalismInnovation
-
BBC is rolling out two AI tools for news production, with a blueprint worth watching: human-supervised summarization and local news scaling.

1️⃣ Summarization has been implemented by various newsrooms already, but BBC chose to keep journalists in the loop. Interesting approach to the scale vs human-in-the-loop equation. Journalists use a single approved prompt, then review & edit everything before publication. They stay in control of the final output.

2️⃣ "BBC Style Assist" - this one's fascinating. The BBC gets hundreds of stories daily from the Local Democracy Reporting Service, but reformatting them into BBC house style takes time. So they built their own LLM that's "read" thousands of BBC articles to learn the house style. The process: LDRS story comes in → Style Assist reformats it → Senior BBC journalist reviews → Gets published. Nothing goes live without human oversight. (A hypothetical sketch of this kind of review-gated pipeline follows below.)

What I really like about this approach:
- They're being transparent about AI use
- Journalists remain in editorial control
- They're solving real production bottlenecks
- Built their own model rather than just using off-the-shelf tools
- Starting with limited pilots in Wales & east England

This feels like the right way to integrate AI into journalism - augmenting human capabilities rather than replacing them, with clear guardrails & transparency.

Curious: Would you trust an article more if you knew AI was used to summarize or reformat it—as long as a human editor checked the final version?

h/t Olle Zachrison for the link (in comments)
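The BBC hasn't published Style Assist's internals, so purely as a hypothetical sketch, a review-gated pipeline of this shape might look like the following (every name here is invented for illustration):

```python
# Hypothetical sketch of the LDRS -> Style Assist -> editor -> publish flow.
# `style_model` stands in for a house-style LLM; no names here are the BBC's.
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Draft:
    source: str                    # e.g. "LDRS"
    text: str
    restyled: Optional[str] = None

def restyle(draft: Draft, style_model: Callable[[str], str]) -> Draft:
    # Machine step: rewrite incoming copy into house style.
    draft.restyled = style_model(draft.text)
    return draft

def publish(draft: Draft, editor_approves: Callable[[Draft], bool]) -> Optional[str]:
    # Human gate: nothing goes live without an explicit editor decision.
    if draft.restyled is not None and editor_approves(draft):
        return draft.restyled
    return None
```

The design point is that the human gate is structural, not optional: `publish` cannot return copy the editor has not seen.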
-
I tested how well some AI tools actually work for journalism. Here's what I found, published today in the Columbia Journalism Review. 🤖🗞️
👉 https://lnkd.in/eU4DVJsa

As a reporter, I’ve often asked myself: Can I actually trust AI tools to support real journalism work? For me, and for folks like Hugging Face’s Florent Daudens, The Washington Post's Jeremy B. Merrill, and Sahan Journal's Cynthia Tu—“vibe checks” aren’t enough.

I teamed up with a group of amazing researchers to run structured tests on some of the most popular AI tools. We used real-world editorial tasks, summarizing government meetings and reviewing scientific research, to see how these tools actually perform. The results? Surprising, frustrating, and occasionally impressive.

📝 Summarizing Local Government Meetings
This is bread-and-butter work for many local journalists. Here's what we found:
- For short summaries (~200 words), tools like ChatGPT-4o, Claude Opus 4, and Perplexity Pro did surprisingly well, often capturing more facts (and hallucinating less) than the human-written summary we used for benchmarking.
- For longer summaries (~500 words), the quality dropped fast. On average, the tools retained only about 50% of the facts, hallucinated more, and missed key details.
- ChatGPT-4o had the most consistent and accurate output, with the lowest hallucination rate and best user experience.

So: AI can help with quick recaps—if humans are verifying the work. But for more in-depth reporting, it still needs a human doing the work. (A sketch of this kind of scoring follows this post.)

🔬 AI & Scientific Research: Not There Yet
We also tested newer AI tools designed to help journalists and researchers make sense of academic work, especially tools that promise to surface related studies or verify the importance of a finding. Most tools surfaced less than 6% of the citations included in expert human literature reviews. Across the board, results were incomplete, or just plain wrong. 100% do not recommend (yet).

Huge thanks to the brilliant team behind this work: Sophia Juco, Sandy Berrocal, Nneka Chile, Julia Kieserman, Jiayue Fan, Emilia Ruzicka, Mona Sloane, and Michael Morisy 🙌 (and anyone I may have missed!). I’m especially grateful for funding and support from The Patrick J. McGovern Foundation, Vilas Dhar, and Nick Cain, who are deeply committed to journalism’s future.

Next steps: If you’re experimenting with AI in your reporting—or you’ve read the piece and have thoughts—I’d love to hear from you. Drop a comment 👇 or shoot me a message. I'm also looking to connect with others interested in developing AI benchmarking standards for journalism, to help folks test tools more easily and responsibly.

Burt Herman, Paul Cheung, Aimee, Nikita Roy, Silvia DalBen Furtado, Nicholas Diakopoulos, Jeremy Gilbert, and many others, I see you!

#AIinJournalism #MediaTech #Journalism #AI
SABEW Investigative Reporters and Editors Global Investigative Journalism Network Online News Association MuckRock Foundation, Tech Policy Press
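The study's full methodology is in the CJR piece itself; as a minimal sketch of the general idea, fact retention and unsupported-claim rates can be framed like this (the naive string matching is deliberately crude and purely illustrative; real grading of this kind needs human reviewers or stronger claim matching):

```python
# Illustrative sketch of fact-retention / unsupported-claim scoring against a
# human benchmark. Naive substring matching stands in for real (human or
# model-assisted) claim matching; treat flagged claims as review candidates,
# not verdicts.
def score_summary(summary: str, benchmark_facts: list[str],
                  summary_claims: list[str]) -> dict[str, float]:
    text = summary.lower()
    retained = [f for f in benchmark_facts if f.lower() in text]
    unsupported = [
        c for c in summary_claims
        if not any(f.lower() in c.lower() or c.lower() in f.lower()
                   for f in benchmark_facts)
    ]
    return {
        "fact_retention": len(retained) / len(benchmark_facts) if benchmark_facts else 0.0,
        "unsupported_claim_rate": len(unsupported) / len(summary_claims) if summary_claims else 0.0,
    }
```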
-
Agentic AI journalism has arrived.

According to the UK Press Gazette this morning, Mediahuis, one of Europe's largest news publishers with 25 titles across five countries, just revealed it's experimenting with a chain of AI journalism agents to produce routine "first-line" news.

Not just one AI tool. A full agentic AI pipeline: commissioning, writing, legal checks, fact-checking, multimedia sourcing, and discourse monitoring - all handled by specialised AI agents before a human journalist reviews and publishes. (A hypothetical sketch of such a chain follows below.) The goal? Free their (currently) 2,000 human journalists to focus on "signature journalism" — investigations, interviews, community-connected depth reporting.

What does this mean for PR and communications professionals? How long before:

1. Your press release may be triaged by AI first. Mediahuis is building curated source databases — wire agencies, parliaments, think tanks, political leaders on social. If your organisation isn't in those source pools in a structured, machine-readable way, you may not even make the first cut. Being findable by validated AI system sources may become as important as knowing the right journalist.

2. The two-tier newsroom needs a two-tier pitch strategy. Routine announcements will increasingly flow through AI-mediated workflows. But "signature journalism" — the pieces that build reputations and break stories — still requires human relationships. Know which tier your story belongs to, and invest your time accordingly.

3. AI monitoring is now part of the editorial cycle. Mediahuis's monitoring agent tracks public discourse around published stories. When polarisation spikes, it flags the topic for deeper editorial investigation. That means how audiences react to initial coverage can now algorithmically trigger follow-up journalism. The crisis response window just got shorter and more complicated (if that's possible).

The multi-agent workflow Mediahuis describes - commissioning, producing, checking, monitoring - maps directly to how many PR teams operate. Is there an opportunity to apply similar thinking to comms content production: use AI for the routine, preserve human expertise for the strategic?

Though fewer routine journalism roles will mean an even thinner pipeline of experienced reporters long-term. And if multiple publishers adopt similar AI systems drawing from the same source databases, do we risk even more homogenised news coverage? What happens when dealing with agentic AI journalism systems becomes the norm?

What changes are you already seeing in how newsrooms handle incoming stories? As ever, welcome your comments below.

Read the original Press Gazette article here: https://lnkd.in/eZ5_SgpS
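Mediahuis hasn't published its architecture, so treat this as a hypothetical sketch of the shape such a chain could take: specialised stages run in sequence, any stage can halt the story, and a human gate sits before publication (all names are invented for illustration):

```python
# Hypothetical "first-line" agent chain with a terminal human gate.
# Stage names mirror the post; the orchestration itself is an assumption.
from typing import Callable, Optional

Story = dict                       # working representation of a story in progress
Stage = Callable[[Story], Story]   # each specialised agent transforms the story

def run_first_line(story: Story, stages: list[Stage],
                   human_review: Callable[[Story], bool]) -> Optional[Story]:
    for stage in stages:
        story = stage(story)
        if story.get("blocked"):   # e.g. a legal-check agent can halt the chain
            return None
    # Final gate: a human journalist reviews before anything is published.
    return story if human_review(story) else None

# Hypothetical stage order, mirroring the post:
#   stages = [commission, write, legal_check, fact_check, source_multimedia]
# (discourse monitoring runs after publication, so it sits outside this chain)
```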
-
The most surprising thing I've learned over the last few months of teaching generative #AI ethics to journalists, comms professionals and tech leaders from Chicago to Alabama to Cairo (and some South Koreans in between): Too few organizations have #AIEthics policies.

That’s troubling because your audience wants AI policies, according to a large-scale study Poynter Institute did with the Hubbard School of Journalism and Mass Communication - University of Minnesota. And it seems every week we witness an AI-related mishap at a news organization — which continues to fray the little trust we have left with our readers.

Good news — it's actually pretty easy! Poynter has a guide I'll link in the comments. But here are a few tips:

• Start an AI committee that is diverse across roles, backgrounds and levels of power in your organization. You don’t want only tech-savvy leaders making the rules. Pick someone to lead the group who has good relationships across your organization (and a good sense of humor; they'll need it).
• Survey your audience about their AI literacy and comfort. Ask what worries them, what excites them and how transparent they want you to be.
• Start with your values, not the tools. Spell out how AI should support your mission and journalism standards.
• Map where AI is allowed – and not allowed – in your workflow from idea to publication.
• Draw hard lines around sensitive work like source protection, corrections and editorial decisions.
• Require human review for anything AI touches before it reaches your audience.
• Develop good disclosures for AI use. Tell your readers how you use AI and how you don’t, in language they can understand. The more automated and audience-facing (and therefore risky) your use, the more transparent you need to be.

Change is hard, especially in the media industry. But AI guidelines can be a useful springboard to prepare your organization or team for the upheaval the technology will bring — and it’s easier than you think. Just do it!