Merge pull request #352 from skynettoday/digest_243
Digest 243
jacky-liang authored Oct 29, 2023
2 parents 4daf5b9 + b6a14ad commit 4fcaea4
Showing 5 changed files with 234 additions and 34 deletions.
6 changes: 5 additions & 1 deletion _posts/digests/2023-10-23-242.md
@@ -18,14 +18,18 @@ redirect: https://lastweekin.ai/p/242

Amazon is set to begin testing Agility's bipedal robot, Digit, in its nationwide fulfillment centers, marking a significant step in the application of humanoid robots in industrial settings. This follows Amazon's inclusion of Agility as one of the first five recipients of its $1 billion Industrial Innovation Fund. While Amazon Robotics has primarily focused on wheeled autonomous mobile robots (AMRs), the company is exploring the potential of legged locomotion, particularly for navigating diverse terrains. The integration of humanoid robots into Amazon's operations could significantly impact the trajectory of the robotics industry, particularly if they prove successful at scale. However, the company is also considering other mobile manipulation solutions, such as mounting a robot arm on an AMR. The success or failure of the Digit pilots could have far-reaching implications for the future of bipedal robots.

#### [State of AI Report 2023](https://www.stateof.ai/)

The State of AI Report 2023 highlights the dominance of Large Language Models (LLMs) in AI research, with significant advances in transformers surprising the AI community. The report discusses the rise of OpenAI's GPT-4 and the increasing reliance on computational power, alongside the thriving open-source community. However, the report also notes new tensions around openness due to commercial and safety concerns. Despite the focus on LLMs, the report also covers progress in other AI fields like navigation, weather prediction, self-driving cars, and music generation. Key takeaways include GPT-4's dominance, efforts to clone or surpass proprietary performance, real-world breakthroughs driven by LLMs and diffusion models, the importance of compute power, the rise of generative AI applications, the mainstreaming of the safety debate, and the challenges in evaluating state-of-the-art models.

#### [Adept Releases Fuyu-8B for Multimodal AI Agents](https://analyticsindiamag.com/adept-releases-fuyu-8b-for-multimodal-ai-agents/)
![](https://149695847.v2.pressablecdn.com/wp-content/uploads/2023/10/Adept-1536x864-1.jpg)

Adept has launched Fuyu-8B, a scaled-down version of their multimodal AI model, designed to understand charts, documents, and diagrams with improved OCR capabilities. The model, which is now accessible through Hugging Face, offers a simplified architecture and training process, making it more accessible and scalable. Fuyu-8B is tailored for digital AI agents, excelling in handling arbitrary image resolutions, answering queries related to graphs, diagrams, and UI-based questions, and delivering responses for large images in under 100 milliseconds. Despite its optimization for specific applications, it performs well in standard image understanding benchmarks. The model uses a vanilla decoder-only transformer, eliminating the need for a separate image encoder and simplifying its structure. In evaluations on prominent image-understanding datasets, Fuyu-8B demonstrated robust performance, outperforming models like Qwen-VL and PaLM-E-12B on multiple metrics.

#### [4K4D: Real-Time 4D View Synthesis at 4K Resolution](https://zju3dv.github.io/4k4d/)


This paper proposes 4K4D, a method for real-time view synthesis of dynamic 3D scenes at 4K resolution. The method uses a 4D point cloud representation that supports hardware rasterization, resulting in faster rendering speeds. The authors also introduce a hybrid appearance model that enhances rendering quality while maintaining efficiency, and they develop a differentiable depth peeling algorithm to learn the model effectively from RGB videos. Using an RTX 4090 GPU, the method renders novel-view videos at over 400 FPS on the DNA-Rendering dataset at 1080p resolution and at 80 FPS on the ENeRF-Outdoor dataset at 4K resolution, which is 30x faster than previous methods while achieving state-of-the-art rendering quality.

### Other News
#### Applications