AI News on November 17, 2025

Technology

Qwen Announces 'Qwen-Edit-2509 Multi-Angle Lighting' LoRA for Enhanced Image Editing

Qwen has announced the release of the 'Qwen-Edit-2509 Multi-Angle Lighting' LoRA (Low-Rank Adaptation), a new adapter designed to enhance image editing. The announcement was made via the @Qwen account on Twitter. The LoRA can be downloaded from Hugging Face at https://huggingface.co/dx8152/Qwen-Edit-2509-Multi-Angle-Lighting. The work is credited to '大雄' (dx8152 on Hugging Face) and is associated with @Ali_TongyiLab.
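
For readers who want to try the adapter, the sketch below shows one plausible way to load it with Hugging Face diffusers. The base-model repo name, the QwenImageEditPipeline class, and the prompt wording are assumptions inferred from the announcement, not instructions from Qwen; check the model card for the intended base model and trigger phrasing.

```python
# Hedged sketch: loading the lighting LoRA on top of Qwen's image-edit
# model with diffusers. The base repo id and pipeline class are
# assumptions; the LoRA repo id comes from the announcement URL.
import torch
from diffusers import QwenImageEditPipeline
from PIL import Image

pipe = QwenImageEditPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit-2509",  # assumed base model repo
    torch_dtype=torch.bfloat16,
).to("cuda")

# LoRA repo id taken from the announcement URL.
pipe.load_lora_weights("dx8152/Qwen-Edit-2509-Multi-Angle-Lighting")

source = Image.open("portrait.png")  # any input image to relight
result = pipe(
    image=source,
    prompt="Relight the scene with warm side lighting from the left",
    num_inference_steps=30,
).images[0]
result.save("relit.png")
```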

twitter-Qwen
Technology

Elon Musk Announces 'Just Grok 4': AI Demonstrates Emergent Intelligence by Redesigning Edison Lightbulb Filament

Elon Musk announced via Twitter that 'This is just Grok 4,' highlighting what he presented as a significant advancement in AI. The announcement followed a demonstration in which Grok analyzed Thomas Edison's 1890 lightbulb patent, then devised and implemented a superior filament design that successfully lit a bulb. This emergent intelligence, described as unique among current AI models, was touted for its potential to revolutionize education and to enable robots to construct.

twitter-Elon Musk
Technology

DeepMind Unveils SIMA 2: A Gemini-Powered AI Agent Capable of Reasoning, Learning, and Playing in Diverse 3D Virtual Worlds, Advancing Towards Embodied AGI

DeepMind has launched SIMA 2, an advanced version of its Scalable Instructable Multiworld Agent. Where SIMA 1 could execute over 600 language instructions across various 3D virtual worlds by observing the screen and using a virtual keyboard and mouse, SIMA 2, powered by the Gemini large language model, goes beyond mere execution: it can reason about user goals, explain its plans and thought processes, learn new behaviors, and generalize experience across multiple virtual environments. The leap is driven by a Gemini-integrated core that combines language, vision, and reasoning, enabling SIMA 2 to understand high-level tasks, translate natural language into action plans, and explain its decisions in real time. Trained on human demonstrations and AI self-supervision, SIMA 2 demonstrates strong cross-game generalization, applying learned concepts to new tasks and operating in previously unseen commercial open-world games. It also supports multimodal instructions and can autonomously navigate and complete tasks in dynamically generated 3D worlds, using a self-improvement loop to keep learning without human feedback. DeepMind positions SIMA 2 as a significant step toward embodied artificial general intelligence.
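
DeepMind has not published SIMA 2 code, so the following is only a schematic sketch of the observe-reason-act loop the description implies; every class and method name here is a hypothetical stand-in.

```python
# Schematic observe-reason-act loop of the kind described for SIMA 2.
# All names are hypothetical stand-ins; no SIMA 2 code is public.
from dataclasses import dataclass, field

@dataclass
class InstructableAgent:
    goal: str
    memory: list = field(default_factory=list)  # experience for self-improvement

    def reason(self, observation: str) -> tuple[str, str]:
        """Stand-in for the Gemini core: map (goal, screen observation)
        to the next plan step plus a natural-language explanation."""
        step = f"take the action that advances '{self.goal}'"
        why = f"The screen shows {observation!r}, so this step moves toward the goal."
        return step, why

    def act(self, step: str) -> str:
        """Stand-in for virtual keyboard/mouse control in the game world."""
        return f"executed: {step}"

agent = InstructableAgent(goal="collect wood and build a shelter")
for frame in ["forest clearing", "tree within reach", "logs gathered"]:
    step, why = agent.reason(frame)   # reason about the user's goal
    outcome = agent.act(step)         # translate the plan into actions
    agent.memory.append((frame, step, outcome))  # log experience for later learning
    print(why, "->", outcome)
```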

Xiaohu.AI 日报
Technology

Saudi AI Startup Humain Unveils 'Humain One' AI Operating System, Revolutionizing Computer Interaction with Natural Language Commands

Saudi Arabian AI startup Humain has officially launched 'Humain One,' a new AI operating system, at the 9th Future Investment Initiative conference in Riyadh. The system aims to replace traditional icon-based operating systems such as Windows and macOS by letting users complete tasks through natural language commands. Humain CEO Tareq Amin said the company is redefining enterprise computing by creating an AI partner that understands user goals, anticipates needs, and autonomously executes tasks. The operating system, driven by Humain's agent orchestration engine and powered by the Arabic-centric language model 'Allam,' is designed to enhance productivity and creativity across enterprise roles. The launch aligns with Saudi Arabia's accelerated push for AI development and its ambition for a leading position in the global market. Humain, established in May 2025 by the country's sovereign wealth fund with Crown Prince Mohammed bin Salman as chairman, is part of Saudi Arabia's 'Vision 2030' plan to become a 'global AI powerhouse.'

AI新闻资讯 - AI Base
Technology

AI Agent Aggregation Platform MuleRun 2.0 Surpasses 500,000 Users in First Month, Led by US Adoption

MuleRun, an AI Agent aggregation platform, announced that version 2.0 has attracted over 500,000 registered users globally within one month of launch, with the United States accounting for the largest share. The platform introduces an 'Agent Team' model: users select a professional identity, after which the system recommends and assembles multiple vertical Agents that collaborate on complex tasks such as e-commerce operations, data analysis, and content creation. MuleRun 2.0 currently integrates hundreds of applications, including Alibaba International Station's PicCopilot, the official Quick BI report-analysis Agent, and Sora video generation, covering scenarios such as product image generation, anomaly detection, and short-video production. The platform supports Python/SQL code traceability, claims 'zero hallucination risk,' and plans to launch subscription payments and an enterprise private-deployment option next month. Analysts suggest MuleRun consolidates disparate AI capabilities into a 'super toolbox' that lowers the barrier for general users, but note that compliance and copyright risks still need to be addressed.

AI新闻资讯 - AI Base
Technology

Google's Gemini Veo 3.1 Launches 'Ingredients to Video' Mode for Pro/Ultra Subscribers: Create 8-Second 1080p Videos from Three Reference Images with Consistent Characters and SynthID Watermarks

Google has rolled out the Veo 3.1 video model to Gemini Pro/Ultra subscribers, introducing a new 'Ingredients to Video' mode. The feature lets users upload three reference images at once; the system extracts character, scene, and style characteristics from them and merges the three into a single 8-second, 1080p video. Generated content carries an invisible SynthID watermark. Users create videos via text prompts on web or mobile, with the system maintaining cross-frame character consistency and lighting coherence. Google demonstrated the mode by combining three selfies, a cyber-city background, and an oil-painting style image to produce a 'futuristic impressionist street walk' short film with no facial or clothing deformation. Veo 3.1 also outputs native environmental sound and supports first/last-frame control and video extension. The multi-image reference feature is fully available and draws on existing subscription quotas; no additional payment plan has been announced.
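
Outside the Gemini app, Veo models are also reachable programmatically. The sketch below uses the google-genai Python SDK's long-running video-generation flow; the model id is an assumption based on the version named above, and the three-reference-image input is omitted because its exact SDK parameter is not confirmed here.

```python
# Hedged sketch of prompting a Veo model via the google-genai SDK.
# The model id is an assumption; the 'Ingredients to Video' reference
# images are omitted because their SDK parameter is not confirmed here.
import time
from google import genai

client = genai.Client()  # reads the API key from the environment

operation = client.models.generate_videos(
    model="veo-3.1-generate-preview",  # assumed model id for Veo 3.1
    prompt="A futuristic impressionist street walk at dusk, oil-painting style",
)

# Video generation is long-running; poll the operation until it finishes.
while not operation.done:
    time.sleep(10)
    operation = client.operations.get(operation)

video = operation.response.generated_videos[0]
client.files.download(file=video.video)
video.video.save("ingredients_to_video.mp4")
```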

AI新闻资讯 - AI Base
Industry News

Alibaba Launches 'Qianwen' App for Public Beta, Leveraging Qwen3 to Challenge ChatGPT in the AI-to-Consumer Market with Free Access and Ecosystem Integration

On November 17th, Alibaba officially launched the public beta of its 'Qianwen' app, marking its full entry into the AI-to-consumer market. Built on the top-performing open-source model Qwen3, Qianwen aims to compete directly with ChatGPT by offering free access and integration with everyday-life scenarios. Alibaba's core management views the project as the 'future battle of the AI era.' The app is now available on major app stores, with web and PC versions also provided, and an international version is slated for release soon to leverage Qwen's global influence and compete for overseas users. Alibaba has committed 380 billion yuan to AI infrastructure this year and plans to expand cloud data-center energy consumption tenfold by 2032. Qwen has become one of the most powerful and widely used open-source large-model families globally, with over 600 million downloads. Its flagship model, Qwen3-Max, reportedly surpasses GPT-5 and Claude Opus 4 in performance, positioning it among the top three globally. Qianwen's strategic goal is to become a future AI lifestyle portal offering smart chat and practical task handling, such as generating research reports and PPTs, and Alibaba plans to integrate scenarios like maps, food delivery, ticketing, and shopping into the app.

AI新闻资讯 - AI Base
Product

Xiaomi Unveils Open-Source 7B Multimodal Model MiMo-VL and AI Butler Miloco for Automated Smart Home Control

Xiaomi has released its 7B-parameter multimodal large model 'Xiaomi-MiMo-VL-Miloco-7B-GGUF' on Hugging Face and GitHub, alongside an AI butler called 'Xiaomi Miloco.' The system uses Mijia cameras to recognize user activities such as gaming, fitness, or reading, and gestures such as a victory sign or thumbs-up; Miloco then automatically controls smart-home devices including lights, air conditioners, and music, and also supports the Home Assistant protocol. Released under a non-commercial open-source license, Miloco can be deployed with a single click on Windows or Linux hosts equipped with NVIDIA GPUs and Docker. Examples include switching on a desk lamp when the user starts reading, adjusting climate control based on bedding during sleep, and greeting people with personalized voice comments based on their clothing style. Xiaomi has released the model weights and inference code but retains the intellectual property and prohibits commercial use.
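
Since the weights ship in GGUF format, one plausible way to examine them locally is llama-cpp-python, sketched below. The Hugging Face repo id and quantization filename are placeholders inferred from the model name, so check the actual model card before running; this illustrates only the open weights, not Xiaomi's Docker-based Miloco deployment.

```python
# Hedged sketch: loading the published GGUF weights locally with
# llama-cpp-python. Repo id and filename are placeholders inferred
# from the announcement; consult the model card for real values.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="XiaomiMiMo/Xiaomi-MiMo-VL-Miloco-7B-GGUF",  # assumed repo id
    filename="*Q4_K_M.gguf",  # placeholder quantization file pattern
    n_ctx=4096,
    n_gpu_layers=-1,          # offload all layers to the NVIDIA GPU
)

out = llm.create_chat_completion(
    messages=[{
        "role": "user",
        "content": "A camera frame shows a person reading at a desk. "
                   "Which smart-home automation should trigger?",
    }],
    max_tokens=64,
)
print(out["choices"][0]["message"]["content"])
```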

AI新闻资讯 - AI Base
Technology

Google's Gemini 3.0 Set for Late 2025 Launch, Aiming to Challenge ChatGPT with Major Breakthroughs in Code Generation and Multimodal AI

Google CEO Sundar Pichai has confirmed that the Gemini 3.0 large language model will be officially released by the end of 2025. The new iteration is expected to deliver significant advances in code generation, multimodal creation, and reasoning, sparking considerable discussion in the global AI community. Gemini 3.0 will integrate an upgraded image-generation engine, Nano Banana, positioned against Sora and DALL·E, and will feature enhanced multi-language, multi-file collaborative coding and debugging. Leveraging Google's TPU v5 chips and Vertex AI, it targets improved response speed and cost efficiency. Despite Gemini's 650 million monthly active users, it trails ChatGPT's 800 million weekly active users. Google's strategy is deep integration with Android 16, Pixel devices, Workspace, and Google Cloud to build a comprehensive AI ecosystem, with the goal of turning users into deep Gemini adopters and reclaiming leadership in generative AI.

AI新闻资讯 - AI Base
Technology

Google Gemini Update: New Multi-Reference Image Feature Empowers Users with Enhanced Control Over AI Video and Audio Generation

Google has rolled out an update to its Gemini application that introduces a new way to control AI video generation. Users can now upload multiple reference images within a single video prompt, and the system generates video and audio based on those images and the accompanying text, giving users more direct control over the final visual and auditory output. Google had previously tested the feature in Flow, its extended video AI platform, which offers higher video quotas and supports extending existing clips and stitching multiple scenes together. Veo 3.1, released in mid-October, reportedly shows significant improvements in texture realism, input fidelity, and audio quality over Veo 3.0. The update aims to give creators greater flexibility and customization in AI-powered content creation.

AI新闻资讯 - AI Base
Technology

AI Prompt Engineering: How to Avoid Being Perceived as a 'Low-Intelligence PhD Student' by AI

A recent social media post highlights an effective prompt for getting AI to explain complex topics. The prompt, praised for producing good results, is: 'Please help me explain this paper in simple terms to high school students.' The author suggests that framing the audience this way keeps the AI from treating the asker as a 'low-intelligence PhD student' and pitching the explanation accordingly. Another user commented that the prompt works for 'all new field research,' indicating its broad utility for simplifying advanced concepts for a general audience.

twitter-宝玉
Research

Groundbreaking Research Reveals Short-Form Video Platforms Like TikTok and Instagram Are Altering Human Brains and Cognition, According to Griffith University Study

A large study by a research team at Griffith University in Australia has found that short-form video platforms such as TikTok, Douyin, and Instagram are subtly changing the human brain. Synthesizing 71 studies covering 98,299 participants, the researchers examined the relationship between short-form video use and both cognition and mental health. Key findings indicate that heavier short-form video consumption correlates with poorer overall cognition, marked declines in attention span, significantly weakened self-regulation, and reduced memory capacity. The study describes how constant exposure to fast-paced content habituates the brain, making it prone to distraction during slower tasks and over-activating the dopamine reward system, which drives a pursuit of instant gratification and erodes patience and deep thinking. It also details negative emotional effects, including poorer emotional regulation and intensified social comparison, contributing to increased anxiety, loneliness, and disrupted sleep.

twitter-小互
Technology

Michael Dell Praises Grok 5 AI Model: Cites 6 Trillion Parameters and High Intelligence Density, Anticipates Exciting 2026

Michael Dell has retweeted a statement highlighting the claimed specifications of the Grok 5 AI model. According to the statement Dell amplified, Grok 5 boasts a 'massive 6 trillion parameters' and 'much higher intelligence density.' He added that 2026 'is going to be exciting!' and thanked Elon Musk, suggesting a connection to the model's development or announcement.

twitter-Elon Musk