One-click token theft, AI tutors beating professors, and GPU RAM used as swap

A compact daily digest: a critical GitHub token exploit, new Copilot model rollout, Stanford’s AI-in-education result, VRAM-as-swap tooling, and what engineers should do today.

A single click can still cause large failures, as a browser‑based IDE leaking tokens shows. Today’s picks also cover models reshaping professional work and a creative hack that repurposes GPU VRAM as swap space. The focus is on immediate operational risk, efficiency plays, and practical tooling that engineers can act on now.

Top Signal

1-Click GitHub Token Stealing via a VSCode Bug

Why this matters now: github.dev users and anyone who opens repos in the in‑browser VS Code (or uses the vulnerable desktop flow) are at risk of losing their broad‑scope GitHub OAuth token with a single click.

Researchers disclosed a chain that turns a recommended‑extension flow into token theft on github.dev. The core weakness is a webview design decision: untrusted iframe content can synthesize keydown events and send them to the host via a did-keydown message. The exploit chains those synthetic events to accept an extension recommendation and either install or activate code that reads and exfiltrates the broad-scope GitHub token the site supplies. As the authors put it bluntly:

"Just by clicking a link, it’s possible for an attacker to steal a GitHub token that can read and write to your repos, including private ones."

The attack matters because github.dev issues a single, broad token for the session and — according to the writeup — lacks CSRF protections that would stop a remote link from redirecting a user into the malicious flow. Immediate mitigations the post suggests are practical: clear github.dev site data before use, avoid opening untrusted links into the site, and audit or uninstall suspicious local extensions. Longer-term fixes discussed in the community include issuing per-repo, narrowly scoped tokens (Codespaces already does this), stricter extension isolation, and rethinking whether extensions should be treated like full Node apps requiring tighter containment.

If you run github.dev regularly, treat this like a hot security bulletin: clear site storage, rotate tokens if you used the site recently, and audit your local extensions. The incident is also a reminder that browser‑hosted IDE convenience brings new attack surfaces that are still maturing.
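For the token audit step, here is a minimal Python sketch of a scope check. It assumes you have already fetched the `X-OAuth-Scopes` response header that the GitHub REST API returns for OAuth and classic tokens (for example from an authenticated request to `https://api.github.com/user`). The `BROAD_SCOPES` set and the function names are illustrative choices, not something the writeup prescribes, and whether the github.dev session token reports its scopes this way is an assumption.

```python
# Hypothetical helper: flag broad GitHub OAuth scopes from an X-OAuth-Scopes header value.
# Which scopes count as "broad" is an assumption made for this sketch.
BROAD_SCOPES = {"repo", "workflow", "admin:org", "delete_repo", "write:packages"}


def parse_scopes(header_value: str) -> set[str]:
    """Split a comma-separated scope header (e.g. "repo, gist") into a set."""
    return {scope.strip() for scope in header_value.split(",") if scope.strip()}


def broad_scopes(header_value: str) -> list[str]:
    """Return the scopes in the header that this sketch treats as broad, sorted."""
    return sorted(parse_scopes(header_value) & BROAD_SCOPES)


if __name__ == "__main__":
    # Example header value; a real one comes from an authenticated API response.
    print(broad_scopes("repo, gist, workflow"))
```

A token that reports `repo` can read and write private repositories, which is the kind of access the quoted attack relies on, so any hit is a reasonable trigger for rotation.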

AI & Agents

MAI‑Code‑1‑Flash lands in GitHub Copilot

Why this matters now: Microsoft’s MAI‑Code‑1‑Flash is now integrated into GitHub Copilot for VS Code, promising lower latency and fewer tokens for interactive coding assistance.

Microsoft says MAI‑Code‑1‑Flash was trained with real Copilot production harnesses and focuses on inference efficiency and adaptive solution length control. The company claims improved pass rates on internal SWE benchmarks and that the model spends less “budget” on easy prompts while allocating more compute to harder problems. The rollout is meant to cut latency and cost inside Copilot, and could change developer experience if the claims hold up in practice.

Community reaction is mixed. Commenters noted confusion around the model’s specs (a reported 137B parameters with only 5B active parameters for inference), and many argued Microsoft benchmarked selectively against specific competitor models rather than the strongest public alternatives. For teams evaluating Copilot, this is both a technical and commercial move: better latency matters, but so do licensing, pricing, and whether the model runs inside Azure as part of Microsoft’s larger strategy. See Microsoft’s announcement for details.

AI outperforms law professors in Stanford study

Why this matters now: Stanford’s study reports that law professors preferred AI answers over peer‑written ones in about 75% of blind comparisons on student contract‑law questions, suggesting AI could change tutoring and grading workflows.

The paper ran 2,918 head‑to‑head comparisons judged by 16 professors and found AI responses “won” roughly three‑quarters of the time across 40 representative office‑hours questions. The lead author said the team was “frankly surprised by the magnitude of the results,” and the study also reported fewer pedagogically harmful flags for AI answers compared with peer-written ones.

"This study challenges important assumptions about AI’s role in legal education," the researchers wrote in their announcement.

That finding is provocative, but not definitive. Methodological caveats matter: the study’s sample of questions and models may not generalize across jurisdictions, and AI’s confident prose can conceal factual errors or hallucinated citations — a critical risk in legal work. For deans and educators, the takeaway is pragmatic: pilot AI as a tutoring aid, but build processes to verify citations and legal reasoning before students rely on it for high‑stakes work. Read the Stanford announcement for the study’s details and caveats.
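To put the headline number in context, the arithmetic can be sketched in Python. This is a rough illustration, not the study's own analysis: it uses the rounded 75% figure and treats all 2,918 comparisons as independent, which they are not, since the same 16 professors judged the same 40 questions. The real uncertainty is therefore wider than this naive margin.

```python
import math


def naive_margin(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of a normal-approximation 95% interval, assuming n independent trials."""
    return z * math.sqrt(p * (1 - p) / n)


n_comparisons = 2918  # head-to-head comparisons reported in the summary above
win_rate = 0.75       # "about 75%", rounded; the exact figure is not given here

if __name__ == "__main__":
    print("approximate AI wins:", win_rate * n_comparisons)
    print("naive margin of error:", naive_margin(win_rate, n_comparisons))
```

Even under the optimistic independence assumption the margin is small next to a 75/25 split, so the open questions are about generalization and answer accuracy rather than sampling noise.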

World

CT scans of BYD car parts reveal heavy vertical integration

Why this matters now: CT teardowns of BYD parts illustrate the vertical integration behind the claim that the company manufactures roughly 75% of its vehicle components, which helps explain how it can outprice competitors and scale fast.

A small CT teardown examined a BYD LFP battery cell, a window switch panel, a portable EV charger, and a key fob to show internal construction and build quality. The images are interesting on their own, but they’re meaningful because they visually support BYD’s level of vertical integration — from cell design to vehicle assemblies — which is a competitive lever for cost and speed.

"We got our hands on four components from BYD's lineup and put them in our CT scanner," the writeup notes.

For automakers and parts suppliers, the scans are a cautionary signal: scale and control of upstream manufacturing can beat traditional supplier networks on price. Consumers and repair advocates should also note the tradeoffs — better cost and quality control can mean less repairability and more vendor lock‑in. See the CT scan writeup for the images and commentary.

Dev & Open Source

Use your NVIDIA GPU's VRAM as swap space on Linux (nbd‑vram)

Why this matters now: nbd‑vram turns idle NVIDIA VRAM into a block device used for swap, giving laptops with soldered RAM a usable memory boost today.

The project allocates VRAM via the CUDA driver and exposes it as a block device through the NBD (Network Block Device) protocol over a Unix socket. In practical tests, the author combined VRAM swap with zram and SSD swap and increased addressable memory substantially — for some workloads VRAM delivered much lower latency for page faults than a sleeping NVMe device.

"The daemon allocates VRAM via the CUDA driver API, then serves it as a block device using the NBD protocol," the repo explains.

A short explainer: NBD lets a userspace server back a block device that the kernel sees as ordinary storage; here that device is backed by VRAM and enabled as swap. The design sidesteps restricted GPU P2P APIs and works without kernel changes, but the tradeoffs are real: VRAM is precious for graphics, and a game or compositor that suddenly needs that memory back could destabilize a desktop session. Community suggestions point to performance and stability improvements, for example avoiding the userspace copy bounce with Linux’s ublk or integrating more tightly with the kernel. If you’re working on memory‑constrained Linux laptops with discrete NVIDIA GPUs, nbd‑vram is a pragmatic hack worth testing; follow the project repo for code and benchmarks.
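When several swap devices are stacked like this, the kernel fills the highest‑priority one first. The Python sketch below parses the `/proc/swaps` table and lists devices in that order so you can check the tiering. The device paths and priority values in `SAMPLE` are hypothetical, chosen to mirror the zram, VRAM and SSD combination described above; they are not taken from the project.

```python
from dataclasses import dataclass


@dataclass
class SwapDevice:
    path: str
    kind: str
    size_kib: int
    used_kib: int
    priority: int


def parse_proc_swaps(text: str) -> list[SwapDevice]:
    """Parse /proc/swaps content and return devices ordered by priority, highest first."""
    devices = []
    for line in text.strip().splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 5:
            continue  # skip malformed or empty rows
        path, kind, size, used, priority = fields[:5]
        devices.append(SwapDevice(path, kind, int(size), int(used), int(priority)))
    return sorted(devices, key=lambda d: d.priority, reverse=True)


# Hypothetical layout: zram first, then the VRAM-backed NBD device, then SSD swap.
SAMPLE = """Filename        Type        Size        Used    Priority
/dev/nvme0n1p3  partition   16777212    0       -2
/dev/nbd0       partition   4194300     0       50
/dev/zram0      partition   8388604     0       100
"""

if __name__ == "__main__":
    for dev in parse_proc_swaps(SAMPLE):
        print(dev.priority, dev.path, dev.size_kib)
```

On a real system, pass the contents of `/proc/swaps` instead of `SAMPLE`; the priorities themselves are set when each device is enabled with `swapon -p`.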

The Bottom Line

Developer tooling and convenience continue to be the biggest attack surface: a single click can still yield full repo access, so containment and least‑privilege matter more than ever. At the same time, efficiency — whether in LLM inference or repurposing GPU RAM — is driving rapid, practical innovations. Patch operational gaps, experiment with promising tooling, and keep verification steps in place when AI starts to shape professional judgments.