I have argued for a while now that the probabilistic nature of LLMs can provide a form of contextual understanding that can be applied to robotic applications.

Seems I’m not the only one who has had this idea. It’s a simple demo, but Microsoft researchers applied a high-level control library to demonstrate an LLM (ChatGPT) developing robotic task code.
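
The pattern, as I understand it, is to describe a small high-level robot API in the prompt and let the model compose calls against it. Here’s a minimal sketch of that idea, assuming the OpenAI Python client; the robot functions in the prompt are my own hypothetical stand-ins, not the actual library from the post:

```python
# Minimal sketch of the "hand the LLM a high-level API" pattern.
# The robot functions described in the prompt are hypothetical
# stand-ins, not the actual control library from the blog post.
from openai import OpenAI

ROBOT_API = """\
You control a robot through these Python functions:
  move_to(x: float, y: float)  # drive the base to a map coordinate
  grab(object_name: str)       # pick up a named, visible object
  release()                    # drop whatever is currently held
Respond ONLY with Python code that calls these functions.
"""

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def plan_task(instruction: str) -> str:
    """Ask the model to turn a natural-language instruction into API calls."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": ROBOT_API},
            {"role": "user", "content": instruction},
        ],
    )
    return response.choices[0].message.content

print(plan_task("I'm thirsty"))
```

The point isn’t the generated code itself; it’s that the model resolves a vague instruction like “I’m thirsty” into calls against whatever narrow API you hand it.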

  • Lil' Bobby Tables@programming.dev · 11 months ago

    Respectfully, this is some pretty trivial control code you’re showing and I imagine that the oversights GPT is known for could be dangerous or damaging for physical automation.

    What end robotics working environment would you see GPT assisting in, and how is it better than a qualified engineer?

    • hlfshell@programming.dev (OP) · 11 months ago

      This isn’t mine; it’s just an interesting blog post I came across. Nor am I arguing that it should replace a robotics engineer.

      My main thought, not fully represented in the post, is that LLMs can act as a context engine for high-level understanding of instructions plus spatial awareness, and then apply that understanding to actuation. This is somewhat touched upon in the article.

      I do think that there is some interesting work in LLM-powered task-level planning. I’m hoping to find the time to put together a good example of this, utilizing the ability of LLMs to make logical leaps based on instruction. In the article, the model took the command “I’m thirsty” to mean it should move to a drink. In a more practical application, we could use an LLM to identify that a room containing multiple detected objects (refrigerator, oven, stove, cabinets, etc.) is in fact a kitchen. Then, from there, it could reason: “I’ve seen a room I’ve identified as a kitchen, so I can navigate there to attempt to find a drink.” A rough sketch of what I mean is below.
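
      To make that concrete, here’s a sketch of the pattern I have in mind, assuming the OpenAI Python client; the prompt wording, detection list, and helper names are all hypothetical:

      ```python
      # Rough sketch: label a room from detected objects with an LLM, then
      # use that semantic label for task-level planning. Prompt wording and
      # function names are hypothetical, not from the article.
      from openai import OpenAI

      client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

      def classify_room(detected_objects: list[str]) -> str:
          """Ask the model what kind of room contains these objects."""
          prompt = (
              "A robot's object detector saw: "
              + ", ".join(detected_objects)
              + ". In one lowercase word, what room is this most likely?"
          )
          response = client.chat.completions.create(
              model="gpt-4o-mini",
              messages=[{"role": "user", "content": prompt}],
          )
          return response.choices[0].message.content.strip().lower()

      def find_room(room_map: dict, room_type: str):
          """Return the coordinates of the first known room of a given type."""
          for coords, label in room_map.items():
              if label == room_type:
                  return coords
          return None

      # Build a semantic map as the robot explores...
      room_map = {}
      room_map[(4.2, 1.7)] = classify_room(
          ["refrigerator", "oven", "stove", "cabinets"]
      )  # -> most likely "kitchen"

      # ...then make the logical leap at planning time: "I'm thirsty"
      # implies finding a drink, and a kitchen is where drinks likely are.
      kitchen = find_room(room_map, "kitchen")
      if kitchen is not None:
          print(f"Navigating to kitchen at {kitchen} to look for a drink")
      ```

      The LLM never plans a trajectory here; it only supplies the semantic leap (objects, to “kitchen”, to “drinks live there”) that a classical planner can’t make on its own.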