On-device AI news roundup and best sources?

I’m trying to keep up with the latest on-device AI developments (like mobile, laptops, and edge devices), but my news feed is all over the place and mostly covers big cloud AI models. I need help finding reliable sites, newsletters, or forums that focus specifically on on-device AI news, benchmarks, and real-world use cases so I can follow this niche more closely. What are your best sources and how do you stay updated?

Your problem is super common: most feeds chase giant cloud models and ignore on-device work. Here is a tight setup you can use.

  1. Daily / frequent news

• The Verge “AI” tag + “chips” tag
Good for Apple, Qualcomm, Intel, Google, laptop, and phone coverage. Skim headlines and open only the hardware- and OS-related ones.

• Ars Technica “AI” + “hardware”
More technical. They cover NPUs, GPU changes, local inference, benchmarks.

• Tom’s Hardware “AI” and “Machine Learning”
Great for PC, edge boards, NPU cards, mini PCs. Look for benchmarks of things like RTX, NPU, Coral, Jetson.

• Android Authority + 9to5Google + 9to5Mac
Good for mobile on-device AI across Android, Pixel, Samsung, and Apple devices. Watch for terms like “on-device”, “NPU”, “Neural Engine”, “Edge TPU”.

  2. Deep technical posts

• Apple Machine Learning Research
They publish posts on Core ML, Neural Engine, memory optimizations, quantization tricks for iPhone and Mac. Low volume, high signal.

• Google AI Blog + Google Research + Android Developers Blog
Search posts with “on device”, “mobile”, “Gemini Nano”, “Edge TPU”. Good for phones and Coral / Edge TPUs.

• Qualcomm, MediaTek, Intel, AMD, NVIDIA blogs
Follow their AI or developer sections. They announce NPUs in laptops, phones, embedded, plus SDK updates.

  3. Developer-focused “firehose” sources

• Papers with Code, filter by “On-device”, “Mobile”, “TinyML”, “Edge”
Sort by “recent”. Good way to see current methods and benchmarks.

• arXiv-sanity or arXiv RSS
Subscribe only to cs.CV and cs.LG searches for “mobile”, “on-device”, “tinyml”, “quantization”, “edge”. Pipe the RSS into Feedly.

• Hacker News
Search or watch for keywords: “on device”, “TinyML”, “llm.c”, “llama.cpp”, “WebGPU”, “webnn”, “mistral local”. Signal is mixed but comments reveal tools people use.
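If you go the arXiv RSS route, a few lines of standard-library Python are enough to keep only the on-device entries. A minimal sketch: the keyword list and the feed URL in the comment are assumptions you'd adjust to your own filters.

```python
import xml.etree.ElementTree as ET

# Keywords that mark an entry as on-device / edge relevant -- adjust to taste.
KEYWORDS = ("on-device", "mobile", "tinyml", "quantization", "edge")

def filter_entries(rss_xml: str, keywords=KEYWORDS) -> list[str]:
    """Return titles of RSS <item>s whose title or description hits a keyword."""
    root = ET.fromstring(rss_xml)
    hits = []
    for item in root.iter("item"):
        text = f'{item.findtext("title") or ""} {item.findtext("description") or ""}'.lower()
        if any(k in text for k in keywords):
            hits.append(item.findtext("title"))
    return hits

# In practice you would fetch a feed first, e.g. with urllib.request:
#   rss_xml = urllib.request.urlopen("https://rss.arxiv.org/rss/cs.LG").read()
SAMPLE = """<rss><channel>
  <item><title>Quantization-Aware Training for Tiny LLMs</title>
        <description>8-bit on-device inference.</description></item>
  <item><title>A Survey of Datacenter Scheduling</title>
        <description>Cluster-scale workloads.</description></item>
</channel></rss>"""

print(filter_entries(SAMPLE))  # -> ['Quantization-Aware Training for Tiny LLMs']
```

Pipe the surviving titles into whatever you already read, or just email yourself the list on a cron job.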

  4. Practical tooling / repos

Track these on GitHub and watch their releases.

• llama.cpp
Core project for running LLMs on laptops, phones, Raspberry Pi, etc.

• ollama
Desktop focused local model runner. Release notes show what runs on which GPUs/NPUs.

• mlc-ai / MLC-LLM
Focuses on WebGPU, phones, TVs, browsers.

• TensorFlow Lite, PyTorch ExecuTorch, GGML, vLLM (where edge work is just starting)
Release notes often include quantization, mobile backends, performance work.
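Watching releases by hand gets old; GitHub's REST API exposes a per-repo `releases/latest` endpoint you can poll instead. A rough sketch, where the repo list and the `EDGE_TERMS` heuristic are my own assumptions (the network call is left commented out):

```python
import json
import urllib.request

REPOS = ["ggerganov/llama.cpp", "ollama/ollama", "mlc-ai/mlc-llm"]  # adjust to taste
EDGE_TERMS = ("quant", "npu", "metal", "vulkan", "mobile", "android", "ios")

def latest_release_url(repo: str) -> str:
    """GitHub REST endpoint for a repo's most recent release."""
    return f"https://api.github.com/repos/{repo}/releases/latest"

def edge_relevant(notes: str, terms=EDGE_TERMS) -> bool:
    """Rough check: do the release notes mention on-device / accelerator work?"""
    text = notes.lower()
    return any(t in text for t in terms)

def check(repo: str) -> None:
    # Unauthenticated calls are rate-limited (60/hour); add a token header for more.
    with urllib.request.urlopen(latest_release_url(repo)) as resp:
        rel = json.load(resp)
    if edge_relevant(rel.get("body") or ""):
        print(f'{repo}: {rel["tag_name"]} looks on-device relevant')

# for repo in REPOS:
#     check(repo)  # uncomment to hit the network
```

Run it weekly and you get a short "what moved for local inference" digest instead of reading every changelog.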

  5. Curated newsletters

Your feed is noisy, so let someone else filter.

• Ben’s Bites
Heavy on cloud but always has some local / on-device links. Quick skim.

• Import AI (Jack Clark)
More policy and big picture but touches on edge sometimes.

• Latent Space + Practical AI podcast
Interviews with folks doing local inference, quantization, on device UX. Audio is nice if you do not want more reading.

  6. Specific search queries that work

Set Google Alerts or use them weekly:

• “on device AI”
• “TinyML”
• “NPU performance laptop”
• “local LLM benchmark”
• “Gemini Nano on device”
• “Core ML quantization”
• “Apple Neural Engine benchmark”
• “Android on device model”

  7. A simple workflow to stay sane

• Use Feedly or an RSS app. Add: Ars Technica AI, The Verge AI, Apple ML, Google AI, Qualcomm blog, Intel AI, a couple newsletters that support RSS.
• Spend 10 minutes, 2 or 3 times a week. Star or tag only on-device / edge posts.
• Once a month, check benchmarks from Tom’s Hardware or Notebookcheck for NPUs and GPUs. They test things like Ryzen AI, Intel AI Boost, Apple M-series, RTX, etc.

  8. A few anchor posts to start with

Search for these types of pieces. They give you a mental map so future news makes more sense.

• “A developer’s guide to on device LLMs on laptops”
• “Running large language models on iPhone with Core ML”
• “Gemini Nano on Pixel 8 Pro technical deep dive”
• “RTX vs NPU for local AI workloads”
• “TinyML on microcontrollers overview”

Once you set up RSS, a few alerts, and a handful of watched repos, your feed gets less random and more focused on on-device stuff. The hard part is the first week of pruning; after that it's just maintenance.

@mike34 covered a ton of the “classic” sources. I’d actually trim a bit instead of adding more feeds, otherwise you just recreate the same firehose problem in a different app.

A few things I’ve found useful that don’t totally overlap with what they listed:

  1. Two “meta” sources that surface a lot of on‑device stuff
  • Reddit (yeah, I know…):
    • r/LocalLLaMA
    • r/MachineLearning, filtered by flair “Deployment” or “Mobile / Edge”
    • r/StableDiffusion for local GPU / NPU tricks
      Sort by “top” for the last week, not “hot”. That kills a lot of hype and leaves more actually useful threads.
  • YouTube, but very cherry‑picked:
    Search and then subscribe only when the channel repeatedly covers local/edge:
    • “llama.cpp”
    • “run LLM locally”
    • “NPU benchmark M3 / Ryzen AI / Intel AI Boost”
      A few creators basically translate dense blog posts into experiments and benchmarks, which is way more actionable than another press release.
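The "top this week" Reddit trick can be automated too: every subreddit exposes a public JSON listing at `/top.json?t=week`. A stdlib-only sketch; the User-Agent string and the score cutoff are placeholders of mine:

```python
import json
import urllib.request

def top_week_url(subreddit: str, limit: int = 10) -> str:
    """Reddit's public JSON endpoint for a subreddit's top posts this week."""
    return f"https://www.reddit.com/r/{subreddit}/top.json?t=week&limit={limit}"

def titles(listing: dict, min_score: int = 100) -> list[str]:
    """Pull post titles out of a Reddit listing, dropping low-score threads."""
    posts = listing["data"]["children"]
    return [p["data"]["title"] for p in posts if p["data"]["score"] >= min_score]

def fetch_top(subreddit: str) -> list[str]:
    # Reddit rejects requests without a descriptive User-Agent.
    req = urllib.request.Request(top_week_url(subreddit),
                                 headers={"User-Agent": "edge-ai-digest/0.1"})
    with urllib.request.urlopen(req) as resp:
        return titles(json.load(resp))

# fetch_top("LocalLLaMA")  # uncomment to try it live
```

The score cutoff does roughly what sorting by "top" does in the UI: it drops the drive-by threads before you ever see them.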
  2. Company material no one mentions, but is super relevant
  • Chrome / Web platform teams
    WebGPU + WebNN progress is basically “on‑device AI for everything with a browser.”
    • Check the Chrome Developers blog
    • Web.dev for posts with “WebGPU” or “on-device inference”
  • Microsoft
    Everyone talks about Copilot cloud stuff, but the Windows + NPU angle is sneaky important:
    • “Windows Dev Center AI”
    • Build / Ignite session videos on “Windows Studio Effects,” “NPU offload,” etc.
      Those sessions quietly spell out where laptop on‑device AI is headed.
  3. Benchmarks and reality checks
    I slightly disagree with leaning too hard on Tom’s Hardware alone. They’re good, but I’d pair them with:
  • Notebookcheck
    They test battery impact, throttling and sustained performance, which matters a lot more for on‑device than peak TFLOPs marketing.
  • MLPerf Tiny / Edge
    Not something you read daily, but when new rounds drop you get a very clean idea of which chips are real and which are just slideware.
  4. For staying out of the pure “research rabbit hole”
    Instead of going all‑in on arXiv RSS like @mike34 suggested, try:
  • Once‑a‑month skim of curated edge / TinyML papers
    • TinyML Foundation newsletters
    • “Awesome TinyML” and “Awesome Edge AI” GitHub lists
      People PR their own new work there, so you get a filtered subset of arXiv without refreshing 200 preprints.
  5. Opinionated “what actually works today” sources
    These aren’t official, but they save a ton of time:
  • Local LLM / SD aggregators
    • LM Studio, Ollama, Invoke, ComfyUI, etc. release notes
      They quietly telegraph which models, quantizations and accelerators are becoming “standard” for home and laptop setups.
  • Issue trackers over blogs
    It sounds boring, but watching issues for
    • llama.cpp
    • ExecuTorch
    • TensorRT‑LLM
      reveals real world on‑device pain points: memory limits, broken drivers, NPU quirks. That’s way more useful than another marketing post on “AI PCs”.
  6. How I’d structure a minimal, non‑insane workflow
    If you want this super lean:
  • RSS or follow:
    • 1 general tech site that respects hardware (Ars or similar)
    • 1 mobile‑focused site (Android dev / Apple dev)
    • Chrome / Web GPU blog
    • 1 TinyML / Edge newsletter
  • Once a week:
    • 10–15 min scan of those
    • 5 min of r/LocalLLaMA “top this week”
  • Once a month:
    • Notebookcheck AI laptop / phone reviews
    • MLPerf Tiny/Edge updates if there’s a new round
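If your reader supports OPML import (Feedly, NetNewsWire, and most others do), the lean list above fits in one file. A sketch below; the `xmlUrl` values are guesses based on each site's advertised feed link, so copy the real ones before importing:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<opml version="2.0">
  <head><title>On-device AI (lean)</title></head>
  <body>
    <!-- xmlUrl values are assumptions: grab the actual feed link from each site. -->
    <outline text="Ars Technica" type="rss" xmlUrl="https://feeds.arstechnica.com/arstechnica/index"/>
    <outline text="Apple ML Research" type="rss" xmlUrl="https://machinelearning.apple.com/rss.xml"/>
    <outline text="TinyML newsletter (placeholder)" type="rss" xmlUrl="https://example.com/tinyml/feed"/>
  </body>
</opml>
```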

If adding a new source makes you check things more than twice a week, cut something else. The trick with on-device news isn't finding info; it's not drowning in the cloud-model hype flood mixed in with it.