
    The team of Zhu Songchun and Zhu Yixin at the Institute for Artificial Intelligence reconstructs robot scenes and uses actionable information to help robots plan autonomously

    • Last Update: 2022-10-20
    • Source: Internet
    • Author: User

    Recently, the team of Professors Zhu Songchun and Zhu Yixin at the Institute for Artificial Intelligence published the paper "Scene Reconstruction with Functional Objects for Robot Autonomy" in IJCV 2022. The paper proposes a new scene reconstruction problem and a scene graph representation that together supply the information a robot needs for autonomous planning, and it produces interactive virtual scenes functionally similar to the real scene for simulation testing. The work also develops a complete machine vision system that realizes the proposed reconstruction. Experiments demonstrate the effectiveness of the reconstruction method and the potential of the scene graph for autonomous robot planning.

    Perceiving the three-dimensional environment and understanding the information it contains is an important manifestation of human intelligence and a prerequisite for humans to interact freely with their surroundings. Beyond the geometry of the environment and the semantics of objects, we also "perceive" potential interactions with the environment, which we call actionable information. For example, when we see the doorknob in Figure 1(a), the action of turning the knob and pulling the door open naturally comes to mind. In the scene in Figure 1(b), we readily observe the constraint relationships among the stacked teacups and dishes (they support one another) and the effect of different actions on their state: pulling out a lower dish directly will topple the dishes and teacups above it, whereas removing objects one by one from the top reaches the lower dishes safely.

    Understanding how potential actions affect a scene is the basis on which we perform tasks and interact with the environment. Correspondingly, intelligent robots need similar perceptual capabilities to autonomously complete complex long-horizon planning in their environments.

    Figure 1 (a) Doorknob, (b) Stacked teacups and dishes (image from the Internet, copyright belongs to the original author)

    As 3D scene reconstruction and semantic mapping techniques have matured, robots can now effectively build three-dimensional maps containing geometric and semantic information, such as the panoptic map of objects and room structures shown in Figure 2(b).

    However, a substantial gap remains between the scene representations produced by these traditional reconstruction methods and what autonomous robot planning requires. The question, then, is twofold: how can we construct a scene representation that serves both robot perception and planning, improving the robot's autonomy? And how can a robot build such a representation of a real-world scene from its own sensor input, such as an RGB-D camera?

    In this paper [1], the researchers pose an entirely new research problem: reconstructing functionally equivalent, interactive virtual scenes that preserve the actionable information of the original real-world scene. The reconstructed virtual scene can then be used for simulation training and testing of autonomous robot planning.
    To accomplish this reconstruction task, the researchers propose a scene graph built on supporting relations and proximal relations, as shown in Figure 2(a); each node represents an object in the scene or a room structure (wall, floor, ceiling). This scene graph organizes the reconstructed scene and the physical constraints it contains, ensuring that the resulting virtual scene is physically plausible. At the same time, it can be converted directly into a kinematic tree of the environment, which fully describes the environment's kinematic state and supports forward prediction of how robot actions affect the environment, so it can be used directly in robot planning tasks.
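    To make this representation concrete, below is a minimal Python sketch, not the authors' implementation, of a scene graph with directed support edges and its conversion into a kinematic tree; all class and field names are illustrative assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class SceneNode:
    """An object or room structure (wall/floor/ceiling) in the scene."""
    name: str
    semantic_label: str  # e.g. "table", "cup", "floor"

@dataclass
class SceneGraph:
    nodes: dict = field(default_factory=dict)     # name -> SceneNode
    support: list = field(default_factory=list)   # directed (parent, child)
    proximal: list = field(default_factory=list)  # undirected pairs

    def add(self, node: SceneNode):
        self.nodes[node.name] = node

    def kinematic_tree(self, root: str = "floor") -> dict:
        """Derive a kinematic tree from the support relations: each
        object's pose is chained to the object that supports it."""
        children: dict = {}
        for parent, child in self.support:
            children.setdefault(parent, []).append(child)

        def build(name):
            return {c: build(c) for c in children.get(name, [])}

        return {root: build(root)}

# Usage: the floor supports a table; a dish and a cup are stacked on it.
g = SceneGraph()
for name, label in [("floor", "floor"), ("table", "table"),
                    ("dish", "dish"), ("cup", "cup")]:
    g.add(SceneNode(name, label))
g.support += [("floor", "table"), ("table", "dish"), ("dish", "cup")]
print(g.kinematic_tree())  # {'floor': {'table': {'dish': {'cup': {}}}}}
```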
    The paper also presents a complete machine vision system that accomplishes this reconstruction task, and designs an output interface for the reconstructed scene so that it can be integrated seamlessly into robot simulators (such as Gazebo) and VR environments. Preliminary work for this paper [2] was published at ICRA 2021.

    Figure 2 (a) A scene graph based on supporting and proximal relations, (b) a volumetric panoptic map, and (c) an interactive virtual scene functionally equivalent to the real scene, which can be used for simulation testing of autonomous robot planning

    Reconstructing a real-world scene in a virtual environment to support robot simulation is not a simple problem. There are three main difficulties: first, accurately reconstructing and segmenting the geometry of every object and structure in a cluttered real scene, and estimating the physical constraints between objects (such as support relations); second, replacing the reconstructed, incomplete geometry with complete, interactive objects (such as CAD models); and third, integrating all of this information into a general scene representation that serves both scene reconstruction and autonomous robot planning.

    This work proposes a dedicated scene graph as the bridge between scene reconstruction and robot interaction: it helps reconstruct virtual scenes consistent with physical common sense and provides the information needed for autonomous planning. On the one hand, the scene graph organizes the perceived objects, the room structures, and the relationships among them, as shown in Figure 3(a). Each node represents an object or room structure identified and reconstructed from the real scene, including its geometry (the reconstructed three-dimensional mesh, the 3D minimum bounding box, extracted plane features, and so on) and its semantics (instance and semantic labels). Each edge encodes either a supporting relation between nodes [the directed edges in Figure 3(a)] or a proximal relation [the undirected edges in Figure 3(a)], and carries the corresponding physical constraint information. For a supporting relation, the parent node must contain a horizontal surface that stably supports the child node; for a proximal relation, the three-dimensional geometries of two nearby nodes must not overlap.
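    As a rough illustration of these two constraints, the sketch below checks them on axis-aligned bounding boxes; the box representation, tolerance, and function names are assumptions made for illustration, not the paper's method.

```python
def supports(parent_box, child_box, tol=0.02):
    """Supporting relation on axis-aligned boxes (x0, y0, z0, x1, y1, z1):
    the child's bottom must rest on the parent's top surface, and its
    footprint must overlap the parent's horizontal extent."""
    px0, py0, pz0, px1, py1, pz1 = parent_box
    cx0, cy0, cz0, cx1, cy1, cz1 = child_box
    on_top = abs(cz0 - pz1) <= tol  # vertical contact
    over = cx0 < px1 and cx1 > px0 and cy0 < py1 and cy1 > py0
    return on_top and over

def proximal_ok(box_a, box_b):
    """Proximal relation: two nearby objects must not interpenetrate."""
    ax0, ay0, az0, ax1, ay1, az1 = box_a
    bx0, by0, bz0, bx1, by1, bz1 = box_b
    overlap = (ax0 < bx1 and ax1 > bx0 and ay0 < by1 and ay1 > by0
               and az0 < bz1 and az1 > bz0)
    return not overlap

table = (0.0, 0.0, 0.0, 1.0, 1.0, 0.75)   # min and max corners, metres
cup   = (0.4, 0.4, 0.75, 0.5, 0.5, 0.85)
print(supports(table, cup))     # True: the cup rests on the table top
print(proximal_ok(table, cup))  # True: boxes touch but do not overlap
```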
    On the other hand, based on semantic and geometric similarity, and taking the constraints between nodes into account, the nodes in Figure 3(a) are replaced with geometrically complete, interactive CAD models (including articulated CAD models), producing a virtual scene that supports robot simulation and interaction, as shown in Figure 3(b).
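    As a hypothetical sketch of this replacement step, the snippet below scores CAD candidates by semantic agreement and bounding-box similarity and scales the winner to fit the reconstructed object; the scoring metric and library format are invented for illustration.

```python
import numpy as np

def cad_score(obj, cad, w_sem=1.0, w_geom=1.0):
    """Score a CAD candidate against a reconstructed object: semantic
    label match plus similarity of normalized bounding-box extents."""
    sem = 1.0 if cad["label"] == obj["label"] else 0.0
    a = np.asarray(obj["extent"], dtype=float)
    b = np.asarray(cad["extent"], dtype=float)
    geom = 1.0 / (1.0 + np.linalg.norm(a / a.max() - b / b.max()))
    return w_sem * sem + w_geom * geom

def pick_cad(obj, library):
    """Pick the best-scoring CAD model and the per-axis scale that
    aligns it with the reconstructed object's extent."""
    best = max(library, key=lambda cad: cad_score(obj, cad))
    scale = np.asarray(obj["extent"]) / np.asarray(best["extent"])
    return best, scale

# Hypothetical library with one articulated model (a cabinet with a door).
library = [
    {"label": "cabinet", "extent": [0.6, 0.5, 1.0], "joints": ["door_hinge"]},
    {"label": "table",   "extent": [1.2, 0.8, 0.7], "joints": []},
]
obj = {"label": "cabinet", "extent": [0.55, 0.45, 0.95]}
model, scale = pick_cad(obj, library)
print(model["label"], scale)  # cabinet, with per-axis scale factors
```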

    Within the limits of perception, such a virtual scene preserves as much of the real scene's functionality, that is, its actionable information, as possible, and can effectively simulate the results of interacting with the objects of the real scene. Correspondingly, the resulting scene graph contains a complete description of the environment's kinematics and constraint states. It can be used to predict the short-term quantitative effect of robot actions on kinematic states, aiding robot motion planning, and to estimate the long-term qualitative effect of robot actions on constraint relations, supporting robot task planning.
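    As a toy example of the long-term qualitative prediction described above, the sketch below propagates the effect of removing an object through the support relations, echoing the stacked-dishes example of Figure 1(b); the function and scene are illustrative, not from the paper.

```python
def affected_by_removal(support_edges, target):
    """Qualitative forward prediction: removing an object invalidates
    the support constraints of everything (transitively) resting on it."""
    children = {}
    for parent, child in support_edges:
        children.setdefault(parent, []).append(child)
    affected, stack = [], list(children.get(target, []))
    while stack:
        node = stack.pop()
        affected.append(node)
        stack.extend(children.get(node, []))
    return affected

# The stacked-dishes scene of Figure 1(b): a lower dish supports an upper
# dish, which supports a cup. Pulling out the lower dish topples both.
edges = [("table", "dish_low"), ("dish_low", "dish_top"), ("dish_top", "cup")]
print(affected_by_removal(edges, "dish_low"))  # ['dish_top', 'cup']
print(affected_by_removal(edges, "cup"))       # []: safe to remove first
```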

    Figure 3 (a) The directly reconstructed scene graph, (b) the interactive scene graph after CAD model replacement

    Figure 4 Flowchart of the machine vision system for the reconstruction task

    To accomplish the reconstruction task above, the authors designed and implemented a multi-module machine vision system: a volumetric panoptic mapping module [Figure 4(A)] and a CAD model replacement and inference module based on physical knowledge and geometry [Figure 4(B)].

    The former uses an RGB-D camera to robustly recognize, segment, and densely reconstruct the geometry of objects and room structures in complex real-world environments, and estimates the constraint relations between them to obtain the scene graph of Figure 3(a). The latter selects the most suitable CAD model from a model library according to the geometric features of each reconstructed object and the recognized constraints, and estimates its pose and scale to align it as accurately as possible with the original object, generating the interactive scene graph shown in Figure 3(b).
    Figure 5 shows the authors' reconstruction of a real office scene captured with a Kinect2 camera, including the volumetric panoptic reconstruction [Figure 5(a)], the functionally equivalent interactive virtual scene [Figure 5(b)], and a sample of robot interaction after importing the virtual scene into a robot simulator [Figure 5(c)].

    Even in complex, heavily occluded real scenes, the proposed reconstruction system builds usable interactive virtual scenes. Figure 5(d-f) shows some interesting cases from this experiment: in Figure 5(d), occlusion of a table by a chair causes the single table to be reconstructed as two shorter tables; the workstation in Figure 5(e) is reconstructed with relatively high quality, with every object replaced by a similar CAD model; the chair in Figure 5(f) goes unrecognized, and its occlusion of the table behind it produces a situation similar to Figure 5(d), while the refrigerator and microwave oven in the scene are reconstructed and replaced with articulated, fully interactive CAD models.

    Figure 5 Reconstruction results with a Kinect2 camera in a real-world environment

    Figure 6 Robot task and action planning in a reconstructed virtual scene

    In the reconstructed interactive virtual scene, the robot can plan tasks and actions [3, 4] with the help of the kinematic chains and constraint information captured in the scene graph; a simulated run is shown in Figure 6. In recent follow-up work [5], the robot plans complex tasks directly on the scene graph using graph edit distance and generates actions efficiently.
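    To illustrate the idea behind [5] without reproducing its planner, the toy example below uses networkx's generic graph edit distance to rank candidate actions by how close their successor scene graphs are to a goal graph; the scenes and actions are invented.

```python
import networkx as nx

def scene(edges):
    """Build a small directed scene graph of (supporter, supported) pairs."""
    g = nx.DiGraph()
    for parent, child in edges:
        g.add_node(parent, label=parent)
        g.add_node(child, label=child)
        g.add_edge(parent, child)
    return g

# Goal: the cup stands on the table, next to the dish.
goal = scene([("floor", "table"), ("table", "dish"), ("table", "cup")])

# Successor scene graphs of two candidate actions (illustrative).
candidates = {
    "move cup onto table": scene([("floor", "table"), ("table", "dish"),
                                  ("table", "cup")]),
    "move dish onto floor": scene([("floor", "table"), ("floor", "dish"),
                                   ("dish", "cup")]),
}

# Prefer the action whose successor graph is closest to the goal.
for action, succ in candidates.items():
    d = nx.graph_edit_distance(
        succ, goal, node_match=lambda a, b: a["label"] == b["label"])
    print(f"{action}: edit distance to goal = {d}")
# "move cup onto table" scores 0.0; "move dish onto floor" scores higher.
```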

    In summary, this work proposes a new scene reconstruction problem and a scene graph representation that provide the information needed for autonomous robot planning, together with interactive virtual scenes functionally similar to the real scene for simulation testing. The work also develops a complete machine vision system that realizes the proposed reconstruction. Experiments demonstrate the effectiveness of the reconstruction method and the potential of the scene graph for autonomous robot planning.

    Looking ahead, several extensions of this work are anticipated: matching rigid and articulated CAD models to reconstructed geometry more robustly and accurately, incorporating richer potential motion information into the scene graph, and making better use of the scene graph for robot planning. With scene graph reconstruction aiding autonomous planning, more intelligent robots are within reach.

    References:

    [1] Han, Muzhi, et al. "Scene Reconstruction with Functional Objects for Robot Autonomy." International Journal of Computer Vision (IJCV), 2022.

    [2] Han, Muzhi, et al. "Reconstructing Interactive 3D Scenes by Panoptic Mapping and CAD Model Alignments." 2021 IEEE International Conference on Robotics and Automation (ICRA), 2021, pp. 12199–12206.

    [3] Jiao, Ziyuan, et al. "Consolidating Kinematic Models to Promote Coordinated Mobile Manipulations." 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021, doi:10.1109/IROS51168.2021.9636351.

    [4] Jiao, Ziyuan, et al. "Efficient Task Planning for Mobile Manipulation: A Virtual Kinematic Chain Perspective." 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021, pp. 8288–8294.

    [5] Jiao, Ziyuan, et al. "Sequential Manipulation Planning on Scene Graph." 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022.

