AI News

Stay updated with the latest AI news and developments in artificial intelligence

November 19, 2025

Product

Manus Launches Browser Operator Chrome Extension: Transforms Any Browser into an AI-Powered Tool for Automated Tasks and Secure Access

Manus has released the Manus Browser Operator, a Chrome extension designed to convert any standard browser into an AI-enabled one. This tool automates complex browser operations, allowing access to protected websites and systems like research platforms and CRM tools without triggering additional login verifications. Currently in a phased rollout for advanced users, the extension aims to significantly boost daily work efficiency. Key features include secure local access, session reuse, and the ability to perform tasks such as data retrieval from databases (Crunchbase, PitchBook), CRM updates, and data extraction from paid platforms. The system operates with a dual-layer architecture, combining cloud-based browsing for general tasks with local browser access for authenticated systems, ensuring secure and efficient task execution. It is currently in beta for Pro, Plus, and Team users, supporting Chrome and Edge, with ongoing optimization for complex interactions.

Xiaohu.AI 日报
Technology

Google Unveils Antigravity: A New AI-Powered Autonomous Platform for End-to-End Software Development, Integrating with Gemini 3 for Agentic Coding

Google has launched Antigravity, a novel platform designed for "AI agent-led development," moving beyond traditional IDEs. This autonomous agent collaboration system enables AI to independently plan, execute, and verify complete software development tasks. Deeply integrated with the Gemini 3 model, Antigravity represents Google's key product in "Agentic Coding." It addresses limitations of previous AI tools, which were primarily assistive and required manual operation and step-by-step human prompts. Antigravity allows AI to work across editors, terminals, and browsers, plan complex multi-step tasks, automatically execute actions via tool calls, and self-check results. It shifts the development paradigm from human-operated tools to AI-operated tools with human supervision and collaboration. The platform's core philosophy revolves around Trust, Autonomy, Feedback, and Self-Improvement, providing transparency into AI's decision-making, enabling autonomous cross-environment operations, facilitating real-time human feedback, and allowing AI to learn from past experiences.

Xiaohu.AI 日报
Technology

Google Vids Unlocks Advanced AI Features for All Gmail Users: Free Access to AI Voiceovers, Redundancy Removal, and Image Editing

Google has made several advanced AI features in its Vids video editing platform available to all users with a Gmail account, previously exclusive to paid subscribers. These newly accessible tools include AI voiceovers, automatic removal of redundant speech, and AI image editing. The transcription trimming feature automatically eliminates filler words like "um" and "ah," along with long pauses, significantly enhancing video quality. Users can also generate professional-grade voiceovers from text scripts, choosing from seven different voice options, many of which sound natural. Additionally, the AI image editing tool allows for easy modifications such as background removal, descriptive editing, and transforming static photos into dynamic videos. Google aims to empower both beginners and experienced creators to produce high-quality video content, anticipating significant growth in the video editing market despite Vids being in its early stages.

AI新闻资讯 - AI Base
Technology

Quora's Poe AI Platform Launches Group Chat Feature Supporting Up to 200 Users for Enhanced Collaborative AI Interactions

Quora has introduced a new group chat feature for its AI platform, Poe, allowing up to 200 users to collaborate with various AI models and bots in a single conversation. This innovation supports multi-modal interactions including text, image, video, and audio generation. The launch coincides with OpenAI's ChatGPT piloting similar group chat functionalities in select markets, signaling a shift in AI interaction methods. Quora highlights that this feature will offer new interactive experiences for AI users, such as family trip planning using Gemini 2.5 and o3 Deep Research, or team brainstorming with image models to create mood boards. Users can also engage in intellectual games with Q&A bots. Group chats can be created from Poe's homepage, with real-time synchronization across devices, ensuring seamless transitions between desktop and mobile. Quora developed this feature over six months and plans to optimize it based on user feedback, emphasizing the unexplored potential for group interaction and collaboration in AI mediums. Poe also enables users to create and share custom bots.

AI新闻资讯 - AI Base
Product

Google AI Developers Announce Immediate Availability of Gemini 3 for Builders

Google AI Developers have announced that Gemini 3 is now available for immediate use by developers. The announcement, made on November 19, 2025, encourages users to 'Start building with Gemini 3 today.' This brief update signifies the release of the new version of Gemini, making it accessible for development projects.
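For developers who want to try it, the Gemini API exposes models through a `generateContent` REST endpoint. The sketch below only builds the request URL and JSON body rather than sending it; the model id `gemini-3-pro-preview` is an assumption for illustration, so check the official model list before using it.

```python
import json

# Base URL of the Gemini (Generative Language) REST API.
API_BASE = "https://generativelanguage.googleapis.com/v1beta"

def build_generate_request(model: str, prompt: str) -> tuple[str, dict]:
    """Return the endpoint URL and JSON body for a generateContent call.

    Note: the model name passed in is an assumption; actual ids come from
    the API's model listing. An API key would be supplied via the
    x-goog-api-key header when the request is actually sent.
    """
    url = f"{API_BASE}/models/{model}:generateContent"
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return url, body

url, body = build_generate_request("gemini-3-pro-preview", "Hello, Gemini 3!")
print(url)
print(json.dumps(body))
```

Sending the body above as a POST with an API key header is all that is needed for a basic text call; streaming and multimodal inputs use the same `contents`/`parts` structure.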

Google AI Developers (@googleaidevs)
Technology

Google Research Unveils Generative UI: AI Now Creates Interactive Interfaces from Simple Prompts, Transforming User Experience in Gemini and Search

Google Research has introduced Generative UI, a groundbreaking interactive technology that enables AI models to generate complete, visual, and interactive user interfaces, including web pages, tools, games, and applications, from natural language prompts. This innovation expands AI's capability beyond mere content generation to full interactive experience creation. Integrated into Gemini App's 'Dynamic View' and Google Search's AI Mode, Generative UI addresses the limitations of traditional AI's linear text output, which struggles with complex knowledge and interactive tasks. The system allows AI to instantly design and implement functional interfaces, such as animated DNA explanations or social media galleries, rather than just providing textual descriptions. This feature is currently experimental in Gemini and available to Google AI Pro and Ultra users in the US for Search's AI Mode, leveraging tool access, system-level instructions, and post-processing for robust and safe interface generation.

Xiaohu.AI 日报
Technology

Google Unveils Gemini 3: A Leap in AI Reasoning, Multimodal Integration, and Agentic Behavior for Complex Understanding and Autonomous Task Execution

Google has officially launched Gemini 3, marking a significant advancement in AI capabilities. Defined by Google as a qualitative leap in higher-level reasoning, multimodal integration, and agentic behavior, Gemini 3 empowers AI with comprehensive abilities to understand complex scenarios, perform cross-modal analysis, and autonomously execute tasks. Key features include enhanced reasoning depth and problem decomposition, allowing it to understand the logic behind questions and break down complex tasks. Its 'Deep Think' mode achieved 41% accuracy on doctoral-level human exams without tools, outperforming other public AI models. Gemini 3 also demonstrates significant progress in multimodal understanding across images, video, audio, and code. A major breakthrough is its agentic capabilities, supported by the new Google Antigravity platform, enabling AI to plan, code, execute, and verify tasks autonomously. Furthermore, Gemini 3 boasts scalable learning and long-horizon planning with million-token context understanding, capable of managing multi-step scenarios consistently. These advancements position Gemini 3 for applications in learning, building, and planning across various domains.

Xiaohu.AI 日报
Technology

xAI Launches Grok 4.1 with Enhanced Performance and Reduced Hallucinations on Web and Apps, Lacks API Access for Enterprise

Elon Musk's xAI has released Grok 4.1, its newest large language model, now available for consumer use on Grok.com, X, and its mobile apps. This launch, preceding Google's Gemini 3, introduces significant architectural and usability improvements, including faster reasoning, improved emotional intelligence, and notably lower hallucination rates. Grok 4.1 has achieved top rankings in public benchmarks, surpassing models from Anthropic, OpenAI, and Google's pre-Gemini 3 models. A white paper detailing its evaluations and training process has also been published. However, a key limitation for enterprise developers is the current absence of API access for Grok 4.1, restricting its integration into production environments. Only older xAI models are presently available via the developer API, supporting up to 2 million tokens of context.

VentureBeat
Technology

Google Unveils Gemini 3: Claims Global AI Leadership in Math, Science, Multimodal, and Agentic Benchmarks, Surpassing Competitors

Google has officially launched Gemini 3, its latest proprietary frontier model family, marking its most comprehensive AI release since the Gemini line debuted in 2023. Available exclusively through Google products and developer platforms, Gemini 3 includes the flagship Gemini 3 Pro, Gemini 3 Deep Think for enhanced reasoning, generative interface models, and Gemini Agent for multi-step tasks. Independent AI benchmarking organization Artificial Analysis has crowned Gemini 3 Pro the "new leader in AI" globally, achieving a top score of 73 on its index, a significant leap from Gemini 2.5 Pro's 9th place. LMArena also reported Gemini 3 Pro as the world's top model across text reasoning, vision, coding, and web development, outperforming Grok-4.1, Claude 4.5, and GPT-5-class systems in various categories.

VentureBeat

November 18, 2025

Technology

ElevenLabs Unveils "Image & Video Platform": A Super AI Content Factory for Integrated Visuals, Audio, and Music Generation, Revolutionizing Content Creation Workflow

ElevenLabs, a leader in multimodal AI, has launched its new "Image & Video Platform," transforming from a voice-only tool into a comprehensive AI content factory. This platform integrates image generation, video generation, voice synthesis, music creation, and sound effect design, enabling creators and marketers to produce commercial-grade videos from script to final product within a single interface. It eliminates the need for switching between multiple platforms by seamlessly combining visual generation with ElevenLabs' audio capabilities. The platform incorporates top multimodal models like Google Veo, OpenAI Sora, and Kling, alongside ElevenLabs' proprietary AI voice and music generation. Designed for commercial use, it supports various aspect ratios, includes a commercial-safe audio library, offers multi-language narration replacement, and features a timeline editor for precise synchronization. Official demonstrations show a 30-second brand advertisement can be created in just five minutes, significantly boosting content production efficiency.

AI新闻资讯 - AI Base
Technology

Google DeepMind Unveils SIMA 2: A General-Purpose AI Agent Powered by Gemini, Achieving Near-Human Performance in Complex 3D Virtual Worlds with Enhanced Reasoning and Self-Improvement

Google DeepMind has launched SIMA 2, an upgraded general-purpose AI agent designed to navigate and perform tasks in complex 3D game environments. Building on its predecessor, SIMA 1 (released in 2024), SIMA 2 integrates the Gemini 2.5 Flash Lite model as its core reasoning engine, enabling it to better understand goals, interpret plans, and continuously learn through self-improvement. While SIMA 1 achieved a 31% task completion rate with over 600 language instructions, SIMA 2 significantly boosts this to 62%, nearing the 71% completion rate of human players. SIMA 2 maintains the same interface but transforms from a mere instruction executor into an interactive game partner, capable of explaining its intentions and answering questions about its goals. It also expands its instruction channels to include voice, graphics, and emojis, demonstrating advanced reasoning by interpreting abstract requests. Furthermore, SIMA 2 features a self-improvement mechanism where it learns from its own experience in new games, with the Gemini model generating and scoring new tasks, leading to success in previously failed scenarios without additional human demonstrations. DeepMind also showcased SIMA 2's integration with Genie 3, allowing it to generate interactive 3D environments from a single image or text prompt, marking a significant step towards advanced real-world robotics.

AI新闻资讯 - AI Base
Product

ElevenLabs Unveils Image & Video (Beta): An All-in-One AI Content Creation Platform for Visuals, Audio, and Music Generation

ElevenLabs has officially launched Image & Video (Beta), a comprehensive AI content creation platform designed for creators and marketers. This integrated platform combines image, video, sound, music, and sound effect generation capabilities. It leverages leading multimodal generative models such as Veo, Kling, and Sora to enable rapid visual content creation. Users can directly synthesize voices, overlay narrations, and edit soundtracks within the ElevenLabs platform, producing commercial and creative video content. The platform supports a streamlined workflow, including image/video generation, audio/voiceover addition with lip-sync, background music and sound effect editing, multi-segment synthesis, and ultra-resolution enhancement via Topaz integration. It aims to provide a unified creative environment, eliminating the need for multiple tools and catering to content creators, marketing teams, educators, and game developers.

Xiaohu.AI 日报
Technology

Poe AI Launches Group Chat Feature: Up to 200 Users Collaborate with 200+ AI Models for Enhanced Interactive Experiences

Poe, a prominent AI platform, has officially introduced a new 'Group Chat' feature, integrating multi-model AI with real-time multi-person interaction. This innovation allows up to 200 users to join a single chat and collaborate with hundreds of AI models for diverse scenarios like travel planning and creative brainstorming. The feature supports seamless collaboration with any AI model, eliminating the need to switch tools and enabling simultaneous use of various AI types within one chatroom. It is compatible with over 200 AI models, including text, image, video, and audio, and allows custom bot integration. Users can mix and match top models like GPT-5.1, Claude 4.5, Gemini 2.5, Sora 2, and Veo 3.1 for comprehensive content creation. The group chat also offers cross-device synchronization for desktop and mobile, ensuring uninterrupted collaboration. This feature is poised to transform enterprise meetings, online education, and virtual communities, making advanced AI resources more accessible.

AI新闻资讯 - AI Base
Technology

xAI Unveils Grok 4.1: A Leap in Emotional Intelligence and Personality Coherence for AI Models, Outperforming Rivals in Key Benchmarks

xAI has officially launched Grok 4.1, aiming for a more natural and credible AI experience beyond a mere question-answering machine. The update significantly enhances creativity, emotional intelligence, and collaborative capabilities, focusing on nuanced intent understanding and consistent personality. Grok 4.1 utilizes a large-scale reinforcement learning (RL) infrastructure and innovative agentic reasoning models as reward models for self-improvement. Key advancements include novel reward modeling, where high-order reasoning models automatically review and refine responses, reducing reliance on manual annotation. It also introduces 'Personality Alignment' as an optimization goal, adding emotional expression rewards and personality coherence metrics to its training. This ensures the model maintains a stable identity and consistent tone across conversations, fostering a sense of continuity for users. Performance assessments show Grok 4.1 leading LMArena's Text Arena leaderboard with an Elo of 1483, outperforming Gemini 2.5 Pro, Claude, and GPT-4.5. It also achieved the highest scores in EQ-Bench3 for emotional empathy and human-like responses and ranked second only to GPT-5 series in Creative Writing v3 Benchmark. Furthermore, Grok 4.1 reduced information error rates by approximately 65% and cut hallucination occurrences roughly threefold.

Xiaohu.AI 日报

November 17, 2025

Technology

Qwen-Edit-2509-Multi-angle Lighting LoRA Released by Qwen for Enhanced Image Editing Capabilities

Qwen has announced the release of 'Qwen-Edit-2509-Multi-angle lighting LoRA,' a new tool designed to enhance image editing. The announcement, made by @Qwen on Twitter, highlights the availability of this LoRA (Low-Rank Adaptation) model, which can be downloaded from Hugging Face at https://huggingface.co/dx8152/Qwen-Edit-2509-Multi-Angle-Lighting. The release is credited to '大雄' and is associated with @Ali_TongyiLab.

twitter-Qwen
Technology

Elon Musk Announces 'Just Grok 4': AI Demonstrates Emergent Intelligence by Redesigning Edison Lightbulb Filament

Elon Musk, via Twitter, announced 'This is just Grok 4,' highlighting a significant advancement in AI. The announcement follows a demonstration where Grok analyzed Thomas Edison's 1890 lightbulb patent, subsequently determining and implementing a superior filament design that successfully illuminated a light. This emergent intelligence, described as unique among current AI models, has been noted for its potential to revolutionize education and enable robots to perform construction tasks.

twitter-Elon Musk
Technology

DeepMind Unveils SIMA 2: A Gemini-Powered AI Agent Capable of Reasoning, Learning, and Playing in Diverse 3D Virtual Worlds, Advancing Towards Embodied AGI

DeepMind has launched SIMA 2, an advanced version of its Scalable Instructable Multiworld Agent, significantly evolving from its predecessor. While SIMA 1 could execute over 600 language instructions across various 3D virtual worlds by observing screens and using virtual keyboard/mouse, SIMA 2, powered by the Gemini large language model, transcends mere execution. It can now reason about user goals, explain its plans and thought processes, learn new behaviors, and generalize experiences across multiple virtual environments. This leap is driven by a Gemini-integrated core that combines language, vision, and reasoning, enabling SIMA 2 to understand high-level tasks, translate natural language into action plans, and explain its decisions in real-time. Trained through human demonstrations and AI self-supervision, SIMA 2 demonstrates remarkable cross-game generalization, applying learned concepts to new tasks and operating in previously unseen commercial open-world games. It also supports multimodal instructions and can autonomously navigate and complete tasks in dynamically generated 3D worlds, showcasing a self-improvement loop for continuous learning without human feedback. DeepMind positions SIMA 2 as a significant step towards Embodied General Intelligence.

Xiaohu.AI 日报
Technology

Saudi AI Startup Humain Unveils 'Humain One' AI Operating System, Revolutionizing Computer Interaction with Natural Language Commands

Saudi Arabian AI startup Humain has officially launched 'Humain One,' a new AI operating system, at the 9th Future Investment Initiative conference in Riyadh. This system aims to replace traditional icon-based operating systems like Windows and macOS, allowing users to interact with computers through natural language commands to complete various tasks. Humain CEO Tariq Amin stated that the company is redefining enterprise computing by creating an AI partner that understands user goals, anticipates needs, and autonomously executes tasks. The operating system, driven by Humain's agent orchestration engine and powered by the Arabic-centric language model 'Allam,' is designed to enhance productivity and creativity across enterprise roles. This launch aligns with Saudi Arabia's accelerated push for AI development, aiming for a leading global market position. Humain, established in May by the country's sovereign wealth fund with Crown Prince Mohammed bin Salman as chairman, is part of Saudi Arabia's 'Vision 2030' to become a 'global AI powerhouse.'

AI新闻资讯 - AI Base
Technology

AI Agent Aggregation Platform MuleRun 2.0 Surpasses 500,000 Users in First Month, Led by US Adoption

MuleRun, an AI Agent aggregation platform, announced that its 2.0 version has attracted over 500,000 registered users globally within just one month of its launch, with the United States accounting for the highest percentage of users. The platform introduces an innovative 'Agent Team' model, allowing users to select a professional identity, after which the system recommends and enables the assembly of multiple vertical Agents. These Agents collaborate to complete complex tasks such as e-commerce operations, data analysis, and content creation. MuleRun 2.0 currently integrates hundreds of applications, including Alibaba International Station's PicCopilot, Quick BI official report analysis Agent, and Sora video generation, covering scenarios like product image generation, anomaly detection, and short video production. The platform supports Python/SQL code traceability, claims 'zero hallucination risk,' and plans to launch subscription payment and enterprise private deployment solutions next month. Analysts suggest MuleRun consolidates disparate AI capabilities into a 'super toolbox,' lowering the barrier for general users, but highlights the need to address compliance and copyright risks.

AI新闻资讯 - AI Base
Technology

Google's Gemini Veo 3.1 Launches 'Ingredients to Video' Mode for Pro/Ultra Subscribers: Create 8-Second 1080p Videos from Three Reference Images with Consistent Characters and SynthID Watermarks

Google has rolled out the Veo 3.1 video model to Gemini Pro/Ultra subscribers, introducing a new 'Ingredients to Video' mode. This feature allows users to upload three reference images simultaneously to extract character, scene, and style characteristics, which are then merged into an 8-second, 1080p video. The generated content includes an invisible SynthID watermark. Users can create videos via text prompts on web or mobile, with the system maintaining cross-frame character consistency and lighting coherence. Google demonstrated this by combining three selfies, a cyber city background, and an oil painting style image to produce a 'futuristic impressionist street walk' short film with no facial or clothing deformation. Veo 3.1 also outputs native environmental sound and supports first/last frame control and video extension. The multi-image reference feature is fully available, utilizing existing subscription quotas without additional payment plans announced.

AI新闻资讯 - AI Base
Industry News

Alibaba Launches 'Qianwen' App for Public Beta, Leveraging Qwen3 to Challenge ChatGPT in the AI-to-Consumer Market with Free Access and Ecosystem Integration

On November 17th, Alibaba officially launched the public beta of its 'Qianwen' app, marking its full entry into the AI-to-C market. Based on the globally top-performing open-source model Qwen3, Qianwen aims to compete directly with ChatGPT by offering free access and integrating with various lifestyle scenarios. Alibaba's core management views this project as the 'future battle of the AI era.' The app is now available on major app stores, with web and PC versions also provided. An international version is slated for release soon to leverage Qwen's global influence and compete for overseas users. Alibaba has invested 380 billion yuan in AI infrastructure this year and plans to expand cloud data center energy consumption tenfold by 2032. Qwen has become the most powerful and widely used open-source large model globally, with over 600 million downloads. Its flagship model, Qwen3-Max, reportedly surpasses GPT-5 and Claude Opus 4 in performance, positioning it among the top three globally. Qianwen's strategic goal is to become a future AI lifestyle portal, offering smart chat and practical task-handling capabilities, such as generating research reports and PPTs. Alibaba plans to integrate various life scenarios like maps, food delivery, ticketing, and shopping into the app.

AI新闻资讯 - AI Base
Product

Xiaomi Unveils Open-Source 7B Multimodal Model MiMo-VL and AI Butler Miloco for Automated Smart Home Control

Xiaomi has launched its 7B parameter multimodal large model, 'Xiaomi-MiMo-VL-Miloco-7B-GGUF,' on Hugging Face and GitHub, alongside an AI butler named 'Xiaomi Miloco.' This system leverages Mijia cameras to identify user activities like gaming, fitness, or reading, and gestures such as victory signs or thumbs-up. Miloco then automatically controls smart home devices including lights, air conditioners, and music, while also supporting the Home Assistant protocol. Operating under a non-commercial open-source license, Miloco can be deployed with a single click on Windows or Linux hosts equipped with NVIDIA GPUs and Docker. Examples include automatic desk lamp activation for reading, climate control adjustments based on bedding during sleep, and personalized voice comments upon entry based on clothing style. Xiaomi has released the model weights and inference code but retains intellectual property, prohibiting commercial use.

AI新闻资讯 - AI Base
Technology

Google's Gemini 3.0 Set for Late 2025 Launch, Aiming to Challenge ChatGPT with Major Breakthroughs in Code Generation and Multimodal AI

Google CEO Sundar Pichai has confirmed the official release of the Gemini 3.0 large language model by the end of 2025. This new iteration is expected to deliver significant advancements in code generation, multimodal creation, and reasoning capabilities, sparking considerable discussion within the global AI community. Gemini 3.0 will integrate an upgraded image generation engine, Nano Banana, to compete with Sora and DALL·E, and will feature enhanced multi-language, multi-file collaborative coding and debugging. Leveraging Google's TPU v5 chips and Vertex AI, it aims for improved response speed and cost efficiency. Despite Gemini's 650 million monthly active users, it trails ChatGPT's 800 million weekly active users. Google's strategy involves deep integration with Android 16, Pixel devices, Workspace, and Google Cloud to create a comprehensive AI ecosystem, with the goal of transforming users into deep Gemini adopters and reclaiming leadership in generative AI.

AI新闻资讯 - AI Base
Technology

Google Gemini Update: New Multi-Reference Image Feature Empowers Users with Enhanced Control Over AI Video and Audio Generation

Google has rolled out an update for its Gemini application, introducing a novel method for AI video generation control. Users can now upload multiple reference images within a single video prompt, allowing the system to generate video and audio based on these images and accompanying text. This new functionality grants users more direct control over the final visual and auditory output of their videos. Previously, Google had tested this feature within its extended video AI platform, Flow, which offers higher video quotas and supports extending existing video clips and stitching multiple scenes. The Veo 3.1 version, released in mid-October, reportedly shows significant improvements in texture realism, input fidelity, and audio quality compared to Veo 3.0. This update aims to provide creators with greater flexibility and customization in AI-powered content creation.

AI新闻资讯 - AI Base
Technology

AI Prompt Engineering: How to Avoid Being Perceived as a 'Low-Intelligence PhD Student' by AI

A recent social media post highlights an effective AI prompt for explaining complex topics. The original prompt, praised for its good results, is 'Please help me explain this paper in simple terms to high school students.' The author suggests this phrasing helps avoid being perceived by AI as a 'low-intelligence PhD student.' Another user commented that this prompt is applicable to 'all new field research,' indicating its broad utility for simplifying advanced concepts for a general audience.
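The quoted pattern generalizes to other subjects and audiences, so it is natural to keep it as a small template. A minimal sketch follows; the function name and `audience` parameter are illustrative additions, not from the original post, and the default values reproduce the quoted wording.

```python
# Minimal sketch: wrap the quoted "explain it simply" prompt in a
# reusable template. Names and parameters here are hypothetical.
def explain_simply(subject: str = "paper",
                   audience: str = "high school students") -> str:
    """Build the explanation prompt highlighted in the post."""
    return f"Please help me explain this {subject} in simple terms to {audience}."

prompt = explain_simply()
print(prompt)
# With the defaults, this reproduces the original prompt verbatim.
```

Swapping in a different subject (e.g. "research field") or audience follows the commenter's observation that the pattern works for any new area of research.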

twitter-宝玉
Research

Groundbreaking Research Reveals Short-Form Video Platforms Like TikTok and Instagram Are Altering Human Brains and Cognition, According to Griffith University Study

A significant study by a research team at Griffith University in Australia has found that short-form video platforms such as TikTok, Douyin, and Instagram are subtly changing the human brain. Analyzing 71 studies involving 98,299 participants, the research investigated the relationship between short-form video use and 'cognition' and 'mental health.' Key findings indicate that increased short-form video consumption correlates with poorer overall cognitive levels, severe declines in attention span, significantly weakened self-regulation, and reduced memory capacity. The study highlights how constant exposure to fast-paced content habituates the brain, making it prone to distraction during slower tasks and over-activating the dopamine reward system, leading to a pursuit of instant gratification and a decline in patience and deep thinking. Furthermore, it details negative emotional effects, including emotional regulation imbalance and intensified social comparison, contributing to increased anxiety, loneliness, and disrupted sleep patterns.

twitter-小互
Technology

Michael Dell Praises Grok 5 AI Model: Cites 6 Trillion Parameters and High Intelligence Density, Anticipates Exciting 2026

Michael Dell has retweeted a statement highlighting the impressive specifications of the Grok 5 AI model. According to Dell, Grok 5 boasts a 'massive 6 trillion parameters' and exhibits 'much higher intelligence density.' He expressed his anticipation for 2026, suggesting it 'is going to be exciting!' Dell also extended thanks to Elon Musk in his statement, indicating a connection to the development or announcement of the Grok 5 model.

twitter-Elon Musk

November 16, 2025

Research

Groundbreaking Research Reveals Short-Form Video's Impact on Brain and Cognition: A Comprehensive Meta-Analysis by Griffith University

A significant study from Griffith University's Department of Psychology has provided the most comprehensive analysis to date on how short-form video platforms like TikTok, YouTube Shorts, and Instagram Reels are altering human cognition and mental health. Analyzing 71 studies with 98,299 participants, this systematic review and meta-analysis, published in 'Psychological Bulletin,' investigates the relationship between short-form video use and cognitive functions, as well as mental well-being. The research highlights that the high-speed visual stimulation, infinite scrolling, algorithmic recommendations, and personalized content delivery inherent in these platforms constitute an 'addictive design architecture.' Key findings indicate that frequent exposure to fast-paced content leads to habituation, making the brain less responsive to slower tasks, and sensitization, increasing dependence on immediate gratification. This process, driven by the dopamine reward system, is linked to reduced attention span, decreased patience, and shallower thinking, potentially impacting the prefrontal cortex and attention networks.

Xiaohu.AI 日报
Technology

ChatGPT's Excessive Dash Usage: Sam Altman Announces 'Cure' for AI's 'Watermark' Habit

ChatGPT has been known for its frequent use of dashes, a stylistic quirk that has become so prevalent it's been dubbed an 'AI watermark.' Sam Altman recently announced that this particular 'ailment' has now been 'cured.' The original news was published by QbitAI on November 16, 2025, credited to author Krecie.

量子位 - 克雷西
Technology

Kimi K2 Thinking Achieves Top Performance on Vending-Bench, Outperforming Open-Source Models with Moonshot API Integration

Kimi.ai announced that its Kimi K2 Thinking model has become the leading open-source model on the Vending-Bench benchmark. This improved performance was observed after re-running the model using Moonshot's own API, a method suggested to enhance tool calling capabilities. The re-evaluation by Andon Labs confirmed that integrating with the Moonshot API significantly boosted Kimi K2's average net worth achieved on the benchmark, solidifying its position as the top performer among open-source alternatives.

twitter-Kimi.ai
Industry News

AI Startup Gamma's Grant Lee Shares 8 Proven Product & Growth Strategies: Serving 50 Million Users with a 50-Person Team

A summary of a deep dive conversation between Lenny Rachitsky and Grant Lee, founder of AI startup Gamma, reveals eight battle-tested product and growth strategies. Gamma, a company that serves 50 million users with a 50-person team and is profitable, emphasizes perfecting the initial 30-second product experience, focusing on a single core value, and delaying advertising until organic word-of-mouth growth exceeds 50%. Other key takeaways include collaborating with hundreds of micro-influencers, personally onboarding early creators, slow and deliberate hiring of top talent, rapid prototyping for idea validation, and committing to long-term problems.

twitter-宝玉
Industry News

Yann LeCun and Google DeepMind's Dr. Adam Brown to Discuss AI's Future Amidst Large Language Model Debate at Pioneer Works

Yann LeCun, a foundational figure in modern AI, will engage in a conversation with Dr. Adam Brown from Google DeepMind at Pioneer Works. The discussion, hosted by Janna Levin, comes as LeCun expresses his conviction that many in the AI field have been misguided by the focus on large language models. This event highlights a critical debate within the AI community regarding the direction and future of artificial intelligence development.

twitter-Yann LeCun
Technology

Elon Musk Praises Grok's 'Eve' Voice as 'So Beautiful,' Users Agree on Its Quality

Elon Musk, via a post on X (formerly Twitter), has highly recommended trying the 'Eve' voice feature of Grok, describing it as 'so beautiful.' This endorsement was echoed by a user, 'Mcmxt,' who stated that Eve is 'legit one of the best voices ever' and their preferred choice for Grok. The brief interaction highlights positive user reception and Musk's personal appreciation for Grok's voice capabilities.

twitter-Elon Musk
Product

Elon Musk Announces Easy Voice Style and Speed Customization for Grok's Voice Mode, Featuring Six Distinct Personalities

Elon Musk has revealed that Grok's Voice Mode offers users the ability to easily change voice styles and speeds. The feature includes six distinct voice options: Ara (Upbeat Female), Eve (Soothing Female), Leo (British Male), Rex (Calm Male), Sal (Smooth Male), and Gork (Lazy Male). Users can access these settings by tapping the settings icon within Grok's Voice Mode. Additionally, the speed of the chosen voice can also be adjusted, enhancing user customization.

twitter-Elon Musk
Technology

Beyond the Hype: Why Current AI Fears Miss the Mark on the Impending Intelligence Revolution, Not Just Automation

Many investors are misinterpreting the current AI landscape, confusing AI automation with true AI intelligence, leading to unfounded fears of an 'AI bubble.' However, historical trends and recent advancements suggest we are on the cusp of an irreversible AI revolution. YC-backed startups demonstrate that small teams leveraging real intelligence models can outperform larger entities, exemplified by OpenAI's ChatGPT surpassing Google despite its vast resources. This is because intelligence scales non-linearly, unlike automation which plateaus. The next major leap is anticipated from AI systems integrating mathematical architectures with quantum computing, enabling real-time simulation of complex global systems. This transition signifies a shift from rule-based automation to emergent intelligence, where AI understands, decides, optimizes, and evolves, fundamentally changing the economic engine.

newest submissions : artificial

November 15, 2025

Product

NotebookLM Reaches Milestone: Now Supports Image Data Sources for Enhanced Information Retrieval

NotebookLM has achieved a significant milestone by integrating support for image data sources. This new capability allows users to upload and retrieve information from various image types, including classroom whiteboard notes, textbook content, tables, and even impromptu street photographs. This feature is anticipated to be particularly beneficial for students and individuals attending lectures, offering a versatile new way to manage and access visual information.

歸藏(guizang.ai) (@op7418)
Technology

Google Cloud and UCLA Introduce Supervised Reinforcement Learning (SRL) to Empower Smaller AI Models with Advanced Multi-Step Reasoning Capabilities

Researchers from Google Cloud and UCLA have unveiled Supervised Reinforcement Learning (SRL), a novel reinforcement learning framework designed to significantly enhance the ability of language models to tackle complex multi-step reasoning tasks. SRL redefines problem-solving as a sequence of logical actions, providing rich learning signals during training. This innovative approach allows smaller, more cost-effective models to master intricate problems previously beyond the scope of conventional training methods. Experiments demonstrate SRL's superior performance on mathematical reasoning benchmarks and its effective generalization to agentic software engineering tasks. Unlike traditional Reinforcement Learning with Verifiable Rewards (RLVR), which offers sparse, outcome-based feedback, SRL provides granular feedback, addressing the learning bottleneck faced by models struggling with difficult problems where correct solutions are rarely found within limited attempts. This enables models to learn from partially correct steps, fostering higher reasoning abilities in less expensive models.
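The contrast between RLVR's sparse outcome-based feedback and SRL's granular per-step signal can be sketched as follows. This is an illustrative stand-in, not the paper's actual formulation: the reward shape (string similarity of each predicted step against an expert step) and the example problem are assumptions made for the sketch.

```python
from difflib import SequenceMatcher

def outcome_reward(pred_answer: str, gold_answer: str) -> float:
    """RLVR-style sparse reward: 1 only if the final answer is exactly right."""
    return 1.0 if pred_answer.strip() == gold_answer.strip() else 0.0

def stepwise_rewards(pred_steps: list[str], gold_steps: list[str]) -> list[float]:
    """SRL-style dense signal: score each predicted step against the expert step."""
    return [
        SequenceMatcher(None, pred, gold).ratio()
        for pred, gold in zip(pred_steps, gold_steps)
    ]

gold = ["expand (x+1)^2 to x^2 + 2x + 1", "substitute x = 3", "compute 9 + 6 + 1 = 16"]
pred = ["expand (x+1)^2 to x^2 + 2x + 1", "substitute x = 3", "compute 9 + 6 + 1 = 15"]

# Outcome-only feedback throws away the two correct steps...
assert outcome_reward("15", "16") == 0.0
# ...while per-step scoring still rewards the partially correct trajectory.
dense = stepwise_rewards(pred, gold)
assert dense[0] == 1.0 and dense[1] == 1.0 and 0.0 < dense[2] < 1.0
```

The point of the dense signal is exactly the bottleneck described above: a model that almost never reaches the fully correct final answer still receives useful gradient information from its partially correct steps.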

VentureBeat
Technology

NVIDIA Earth-2 and CorrDiff Achieve 50x Speedup in Weather Prediction with Gen AI Super-Resolution for Scalable AI Models

Generative AI super-resolution is significantly accelerating weather prediction, achieving a 50x speedup through the integration of NVIDIA Earth-2 and CorrDiff. This advancement enables the development of low-compute, scalable AI models, leading to faster training times and the capability for real-time predictions. The technology promises to revolutionize how weather forecasts are generated and delivered, making them more efficient and accessible.

Twitter @NVIDIA AI Developer
Technology

New Foundational AI Model Leverages Supercomputing for Early Detection of Rare Cancers from 3D Medical Imaging Data

A new foundational AI model, developed by TU/e's team using the SPIKE-1 supercomputer, is capable of adapting to identify early signs of rare cancers. Medical imaging generates vast amounts of 3D data that are challenging to analyze comprehensively for disease detection, particularly for rare cancer types. By utilizing SPIKE-1, which boasts approximately 100 times the computing power of its predecessor, the team created a versatile AI model trained on over 250,000 CT scans. This innovation aims to enable faster and more accurate cancer detection. TU/e is also making these state-of-the-art tools open source to foster global collaboration and significantly advance rare cancer research and healthcare innovation worldwide.

Twitter @NVIDIA AI Developer
Technology

Meta Tech Podcast Explores How Open Hardware and AI Drive Environmental Sustainability, Featuring OCP Summit 2025 Announcements and Net Zero Goals

The latest Meta Tech Podcast episode features Pascal Hartig, Dharmesh, and Lisa discussing the environmental benefits of open-source software and the emerging field of open hardware. The discussion highlights Meta's key announcements from the 2025 Open Compute Project (OCP) Summit, including a new open methodology utilizing AI to analyze Scope 3 emissions. The podcast delves into OCP's history and its growth to over 400 contributing companies. Listeners will learn how AI and open hardware are instrumental in Meta's pursuit of net-zero emissions by 2030, specifically mentioning AI's role in developing innovative concrete mixes for data center construction. The episode is available on Spotify, Apple Podcasts, and Pocket Casts.

Engineering at Meta
Industry News

OpenAI Criticizes Court Order Granting NYT Access to 20 Million User Chats

OpenAI has expressed strong disapproval of a recent court order that permits the New York Times to review 20 million complete user chat logs. The details surrounding the court's reasoning for this decision, the specific context of the dispute between OpenAI and the New York Times, and the potential implications for user privacy or data security are not provided in the original submission. The news was submitted by u/F0urLeafCl0ver on November 14, 2025, to the r/artificial subreddit.

newest submissions : artificial
Technology

Databricks Unveils 'ai_parse_document' to Tackle Unsolved PDF Parsing for Agentic AI, Streamlining Enterprise Data Extraction

Databricks has introduced 'ai_parse_document' technology, integrated with its Agent Bricks platform, aiming to resolve the persistent challenge of accurately parsing complex PDF documents for enterprise AI. Despite common assumptions, extracting structured data from enterprise PDFs, which often combine digital content, scanned pages, tables, and irregular layouts, remains largely unsolved by existing tools. This bottleneck hinders enterprise AI adoption, as approximately 80% of enterprise knowledge is locked in these difficult-to-process documents. Current workarounds involve stacking multiple specialized tools, leading to significant custom data engineering and maintenance. Databricks' new tool seeks to replace these multi-service pipelines with a single function, addressing issues like dropped or misread tables, figure captions, and spatial relationships that compromise downstream AI applications and RAG systems.

VentureBeat

November 14, 2025

Technology

TrendRadar: AI-Powered News Hotspot Aggregation and Public Opinion Monitoring Tool for Multi-Platform Insights

TrendRadar, developed by sansan0, is an AI-driven tool designed to combat information overload by providing simple public opinion monitoring and analysis. It aggregates hot topics from 35 platforms, including Douyin, Zhihu, Bilibili, Wall Street Insights, and Cailian Press. The tool offers intelligent filtering, automatic push notifications, and AI-powered conversational analysis with 13 tools for deep news mining, such as trend tracking, sentiment analysis, and similarity search. TrendRadar supports notifications via WeChat Work, Feishu, DingTalk, Telegram, email, and ntfy. It boasts quick deployment with 30-second web setup and 1-minute mobile notifications, requiring no programming. Docker deployment is also supported, aiming to leverage AI for understanding hot topics and making algorithms serve users.
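One of the listed capabilities, similarity search across headlines gathered from many platforms, can be sketched with a simple greedy grouping. TrendRadar's actual matching logic is not described in the announcement, so the `SequenceMatcher` metric, the 0.6 threshold, and the sample headlines here are purely illustrative.

```python
from difflib import SequenceMatcher

def near_duplicates(headlines: list[str], threshold: float = 0.6) -> list[list[str]]:
    """Greedily group headlines that likely cover the same story."""
    groups: list[list[str]] = []
    for text in headlines:
        for group in groups:
            # Compare against the group's first headline as its representative
            if SequenceMatcher(None, text, group[0]).ratio() >= threshold:
                group.append(text)
                break
        else:
            groups.append([text])
    return groups

headlines = [
    "OpenAI releases GPT-5.1 API",
    "OpenAI releases GPT-5.1 API to developers",
    "NVIDIA speeds up weather prediction with AI",
]
groups = near_duplicates(headlines)
assert len(groups) == 2      # two distinct stories
assert len(groups[0]) == 2   # the GPT-5.1 headlines fall into one group
```

In a real aggregator this naive O(n²) pass would be replaced by embedding-based retrieval, but the grouping idea is the same.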

GitHub Trending
Industry News

Apple's Mini Apps Partner Program: A Glimpse into Future Developer Opportunities (Hacker News Discussion)

The Hacker News submission links to Apple's developer page for a 'Mini Apps Partner' program, but the post body consists solely of the word 'Comments,' so the program's objectives and the substance of the discussion are not described. The program's existence suggests Apple is exploring 'mini-program'-style app experiences, similar to those seen on other platforms, potentially aimed at enhancing user experience or developer engagement; beyond that, any further detail would be speculative.

Hacker News
Product

OpenAI Launches Apps for ChatGPT Business and Enterprise Plans, Enhancing Workspace Deployment with Developer Mode

OpenAI has announced the availability of apps for its ChatGPT Business and Enterprise plans. Users on these plans can now leverage developer mode to test and deploy applications within their workspaces. This new feature aims to provide enhanced functionality and customization options for businesses utilizing ChatGPT.

OpenAI Developers (@OpenAIDevs)
Technology

Tweeks (YC W25) Chrome Extension Leverages LLMs for Automated Userscript Generation, Sparks Debate on Privacy, Legality, and Open Source

Tweeks, a YC W25 Chrome extension, aims to 'de-enshittify' the web by automatically generating userscripts using Large Language Models (LLMs), similar to Greasemonkey/Tampermonkey. The extension captures current page content for LLM generation, with the resulting static scripts running locally. Key discussions revolve around technical feasibility, particularly with complex web structures and Manifest V3, and significant privacy concerns due to sending page content to LLMs during generation and the broad permissions required. Legal and platform risks, including potential site bans or lawsuits, are also central, with historical precedents like FB Purity cited. The business model and the extent of open-sourcing are debated, with the founders expressing caution about full open-source due to potential replication by larger entities. While users praise its ease of use for customization, the team acknowledges reliance on manual testing for accuracy and is exploring local small models for future cost and privacy improvements. The founders have disclosed DPA agreements with LLM providers regarding data retention and SOC 2 compliance.

Hacker News
Product

Perplexity Pro and Max Subscribers Gain Access to GPT-5.1

Perplexity has announced that GPT-5.1 is now accessible to its Perplexity Pro and Max subscribers. The update was shared via the company's official Twitter account on November 13, 2025, accompanied by a video (the video content itself is not available in the source text). The announcement marks an enhancement to the features available to Perplexity's premium user base.

Perplexity (@perplexity_ai)
Industry News

OpenAI and Microsoft Partner with State Law Enforcement on AI Safety Task Force

OpenAI and Microsoft have reportedly joined forces with state law enforcement agencies to establish an AI safety task force. This collaboration aims to address critical issues surrounding artificial intelligence safety and its implications. Further details regarding the specific objectives, scope, and operational framework of this task force are not available in the provided information.

newest submissions : artificial
Technology

GPT-5.1 and Specialized Codex Models Now Accessible via API with GPT-5 Pricing; Enhanced Prompt Caching Introduced

OpenAI has announced the immediate availability of GPT-5.1 through its API, maintaining the same pricing structure as GPT-5. Alongside this release, two new specialized models, gpt-5.1-codex and gpt-5.1-codex-mini, have also been launched in the API, specifically designed for handling long-running coding tasks. A significant improvement in API functionality is the extension of prompt caching duration, which now persists for up to 24 hours. Further details and updated evaluations are available in the company's blog post.
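Prompt caching matches on identical prompt prefixes, so requests benefit most when the long, static instructions come first and the per-request content comes last. A minimal sketch of a cache-friendly payload is below; the actual SDK call is omitted, and the helper name and sample prompt are assumptions for illustration (the model name `gpt-5.1` is from the announcement).

```python
# A long, static system prompt: identical across requests, so the provider's
# automatic prompt caching can reuse it as a shared prefix for up to 24 hours.
LONG_SYSTEM_PROMPT = "You are a code-review assistant. " + "Apply the style guide. " * 50

def build_request(user_code: str) -> dict:
    """Build a chat-completion payload whose static prefix is cache-friendly."""
    return {
        "model": "gpt-5.1",
        "messages": [
            # Static prefix first, so consecutive requests share it verbatim
            {"role": "system", "content": LONG_SYSTEM_PROMPT},
            # Variable content last, so it does not break the shared prefix
            {"role": "user", "content": user_code},
        ],
    }

req_a = build_request("def f(x): return x + 1")
req_b = build_request("def g(y): return y * 2")
# Both requests begin with the same message: the cacheable prefix
assert req_a["messages"][0] == req_b["messages"][0]
```

Putting variable content (user code, retrieved documents) before the static instructions would make every prompt prefix unique and forfeit the cache hit.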

twitter-Sam Altman
Technology

OpenAI Releases GPT-5.1 API: Faster, More Steerable, and Enhanced for Coding with New Tools

OpenAI has announced the immediate availability of GPT-5.1 in its API. This new iteration is touted as a significant upgrade, offering improved speed, enhanced steerability, and superior coding capabilities. It also comes equipped with practical new tools. OpenAI suggests that GPT-5.1 will be particularly beneficial for developers creating applications or agents where intelligence, speed, and cost-effectiveness are critical factors, promising a meaningful upgrade for such use cases.

OpenAI Developers (@OpenAIDevs)
Product

OpenAI Announces GPT-5.1 for Developers: A New Era of AI Innovation

OpenAI has announced the release of GPT-5.1 for developers, signaling a significant update in their large language model offerings. This announcement, published on November 13, 2025, on Hacker News, points to an official page on OpenAI's website dedicated to this new version. While specific features and enhancements are not detailed in the provided snippet, the release is expected to provide developers with advanced capabilities for integrating cutting-edge AI into their applications and services. The news has garnered initial attention with 34 points and 1 comment on Hacker News, indicating early interest within the tech community.

Hacker News: Front Page - tedsanders
Industry News

Cloudflare CEO Accuses Google of Abusing Search Monopoly to Fuel AI Development

The CEO of Cloudflare has reportedly accused Google of leveraging its dominant position in the search engine market to benefit its artificial intelligence initiatives. This claim, submitted by u/fortune, suggests a potential misuse of market power by Google to advance its AI capabilities.

newest submissions : artificial
Product

Qwen DeepResearch 2511 Launches with Major Upgrade: Deeper Analysis, Faster Research, and Enhanced User Experience

Qwen has announced the live launch of Qwen DeepResearch 2511, a significant upgrade designed to make research deeper, faster, and smarter. Key new features include Dual Mode Selection, offering both a Normal Mode for efficiency and an Advanced Mode for more thorough analysis. Users can now upload documents and images for AI analysis, and the platform boasts boosted search power for improved efficiency and depth in processing web information. Additionally, Precise Report Control allows users to command report format, word count, paragraphs, and content, with enhanced citation reliability. The update also introduces an all-new, smoother, and more responsive user experience thanks to a decoupled architecture.

Qwen (@Alibaba_Qwen)
Technology

Google DeepMind's SIMA 2 Demonstrates Unprecedented Adaptability in Genie 3 Simulated 3D Worlds

Google DeepMind has announced the successful testing of SIMA 2 within simulated 3D environments generated by their world model, Genie 3. SIMA 2 showcased remarkable adaptability, effectively navigating its surroundings and making significant progress towards predefined objectives. This development highlights advancements in AI's ability to interact and achieve goals within complex virtual settings, as detailed in a recent update from Google DeepMind.

Google DeepMind (@GoogleDeepMind)