AI in Motion: 24-hour hardware hackathon recap
If 2023 was the year of LLMs (large language models), then 2024 will be the year of LMMs (large multimodal models). The main difference: these models can take in and generate images as well as text. This opens up a whole new set of possibilities for hardware.
To explore what’s possible when combining the latest hardware with the latest machine learning models, we hosted a weekend-long hackathon at Studio 45 in San Francisco. The main intention was to bring together two communities that are rather separate right now: the machine learning community and the robotics community. Hackers came together to see what spatial challenges they could solve. Everyone had 24 hours to form a team, build a demo, and pitch the judges at the end.
Smarter interfaces, GPT-4 with eyes, and an open source model from DeepMind
In the two weeks around the hackathon, there were new developments in the realm of hardware and AI:
- Investment in smarter interfaces: Meta released Ray-Ban smart glasses with a streaming setup. OpenAI is reportedly in talks with Jony Ive about an iPhone-like replacement built around their latest models. Earlier in the year, Humane demoed their AI pin on a TED stage, and Apple has released the visionOS SDK for the Vision Pro headset as people build spatial applications.
- OpenAI gives GPT-4 vision: GPT-4V (vision) is now available on premium accounts, letting people chat with images, alongside DALL·E 3 for generated content. Some use cases include taking a front-end mockup and writing back-end code, making movie stills, or even figuring out confusing street signs; a minimal API sketch follows this list. More examples here and Microsoft’s full 166-page paper on GPT-4V here.
- Google DeepMind open sources RT-X: Benchmarked on over 500 skills across more than 150,000 tasks, the RT-X model outperformed traditional narrow-intelligence models, with more here.
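To make the GPT-4V item above concrete, here’s a minimal sketch of chatting with an image through OpenAI’s Python SDK (version 1.x). The prompt and image URL are illustrative placeholders, not a prescribed setup; check OpenAI’s docs for the current vision-capable model name.

```python
# Minimal sketch: asking GPT-4 with vision a question about an image.
# Assumes the OpenAI Python SDK (>= 1.0) and an OPENAI_API_KEY in the environment;
# the prompt and image URL below are illustrative placeholders.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-vision-preview",  # vision-capable model at the time of writing
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What does this street sign allow me to do right now?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/confusing-sign.jpg"}},
            ],
        }
    ],
    max_tokens=300,
)

print(response.choices[0].message.content)
```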
While the hackathon missed these exciting updates by two weeks, they underline exactly why we ran AI in Motion in San Francisco: 1) machine learning is advancing so quickly that we want to see what happens when it’s integrated more tightly with hardware, and 2) we wanted to bring together two communities that don’t often get to hack together.
Our main goal was bringing together AI + hardware
To structure the hackathon, we started with three goals:
Goal 1: Ensure the group is ½ machine learning developers and ½ hardware engineers.
We achieved: Hackers from OpenAI, DeepMind, Meta AI Labs, Tesla, and more came with experience across both disciplines.
Goal 2: Bring in some awesome hardware to see what the latest machine learning models could do. Yes, more than just LLMs were used in the making of these demos.
We achieved: We had universal robotic arms, a Boston Dynamics Spot quadruped, Roombas, and a whole IoT kit library. Here’s the full documentation we gave hackers. Look out for an open source document soon!
Goal 3: Go from idea to demo in 24 hours. We launched Saturday at 10 a.m. and demos were finalized by 10 a.m. on Sunday. Yes, many still got to sleep.
We achieved: Check out the results below!
Overall, the results were impressive. It felt like a taste of a very near future that will be drastically different: smarter interfaces, more capable hardware, and, more importantly, smaller teams shipping bigger builds.
Top 5 project highlights
1. Jarvis is a robo-mechanic assistant. Think Tesla manufacturing arm, but in a small-scale garage, and one you can talk to.
h/t to the hackers: @jqphu, @nishthenomad, @TristanHeywood, @The_TT_Hacker, @winston, @vrushank
2. XR is a smart, learning hearing aid that captures information about the user’s surroundings, like who’s nearby and what’s happening. By combining camera input with computer vision and voice recognition through OpenAI’s API, their demo could enhance a user’s understanding of the visual world around them.
h/t to the hackers: @jer, @EmmaQian_, @ClovisVinant, @lingxue, @varun, @esh
3. C.H.I.P. is a digital CNC microscope with zero-shot classification for bad-chip detection; a rough sketch of this kind of classifier follows the project list.
h/t to the hackers: @johndmcmaster, @notionsmith, @justin, @ninjaa
4. Dex (overall winner) scans and queries a room for lost objects. They added a webcam to a Roomba and made it so you could chat with the image data.
h/t to the hackers: @cyrus_cowley, @ian, @surya
5. Spotsight (the crowd favorite) was a robotic seeing-eye dog for the visually impaired. It could help navigate an environment safely, at a more affordable price than a professionally trained seeing-eye dog, and handle additional tasks between owner and environment, like getting the mail.
h/t to the hackers: @ingarobotics, @cyb3rblaze_, @adit, @reuben, @abinaya
🤖 More projects here if you’re curious.
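For the curious, here’s a hedged sketch of the kind of zero-shot bad-chip classification C.H.I.P. described. We don’t know the team’s exact stack, so this uses an open CLIP checkpoint through Hugging Face transformers; the labels and image path are placeholders, not their actual setup.

```python
# Minimal sketch of zero-shot bad-chip detection with CLIP, similar in spirit to C.H.I.P.
# Assumes `transformers` and `Pillow` are installed; the model, labels, and image path
# are illustrative placeholders.
from PIL import Image
from transformers import pipeline

classifier = pipeline(
    "zero-shot-image-classification",
    model="openai/clip-vit-base-patch32",
)

image = Image.open("microscope_frame.jpg")  # a frame captured from the digital microscope
labels = ["a photo of a defective chip", "a photo of a good chip"]

# CLIP scores the image against each candidate label without any task-specific training.
results = classifier(image, candidate_labels=labels)
for result in results:
    print(f"{result['label']}: {result['score']:.2f}")
```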
Ideas to improve on the next one
- From idea to storyboard. We discovered a gap between teams finding a clear direction on what problem they wanted to solve and figuring out how they would demo it in 24 hours. For future hacks, we’ll promote two parts of the storyboard: 1) what you want the robot to do and why, and 2) what your back-end architecture will look like. We course-corrected on the spot and can help teams with this sooner next time.
- Multi-part hack. The most challenging hackathons at MIT can take two weekends instead of one sitting, as it can take a full weekend to explore an idea and figure out what the demo will be, with a second weekend for the finalists to build. We may explore this for the next challenge to see if it helps promote higher-quality demos. In this case, we’d keep the first weekend to one day, and the second weekend would be two days.
- Simplify the hardware library. We’re going to explore the two-weekend model with a more scoped-down library. Having too much hardware was a good problem to have, but since each robot needs a mentor, we think it may be better to run a more scoped challenge, like one built around the Boston Dynamics Spot quadruped or the universal robotic arms. If we did this, we’d have to work out scheduling for teams to share the hardware, or get multiple bots.
This was a fun event both for hackers and for those who came for the final showcase. We need more hardware hackathons!
To build great hardware, it really does take a full ecosystem, even if only for 24 hours, and we’re grateful to all our sponsors who helped make this event happen, especially informal, which played a pivotal role!
If you want to run your own hardware hackathon, stay tuned: in a future post, we’ll share tactical tips on how to run one.
Thank you to all the partners who helped make this happen!
- Studio45 is a clubhouse and coworking space for professionals building physical products in the Bernal Heights neighborhood of San Francisco.
- informal is a freelance collective for the best independent professionals in hardware and manufacturing. informal members work with companies at every scale to design, manufacture, and ship physical products.
- Blues Wireless makes cloud-connected products actually possible with out-of-the-box connectivity.
Community partners
- Massmelt offers a diverse set of hardware product-development services dedicated to supporting organizations as they build their vision of the future.
- SF Hardware Meetup is a community of 9,000+ hardware professionals meeting monthly to build meaningful connections.
- Cerebral Valley and GenAI Collective helped spread the word to the Bay Area machine-learning community.
Prize sponsors
- Runpod is a GPU cloud platform for training and scaling inference on AI models. They offered cloud credits to the top team.
Thank you to our judges for their feedback on the projects!
- Ashley is a former tech lead at X, Alphabet’s Moonshot Factory.
- Robert is a product lead at Waymo, a self-driving company within Alphabet.
- Santhi is a founder, angel investor, and venture fellow at Designer Fund.
- Vince is a 7x entrepreneur and Silicon Valley impact tech investor.
And last, but not least, a shout-out to the co-organizers, Michael + Jascha, both members of informal, for volunteering to make this event a great experience for everyone who joined.
If you have an AI-enhanced hardware project in mind, reach out to informal for help!
P.S. Yes, this article was written by us (Michael + Jascha), so yes, we did just give ourselves a third-person shout-out ✌️