ZoomBot AI: Enhancing Virtual Meeting Experiences
This presentation documents the industry internship undertaken at Webelight, a technology company specializing in innovative software solutions. The internship focused on contributing to the development of ZoomBot AI, an intelligent meeting assistant designed to enhance the Zoom meeting experience through real-time audio analysis and AI-driven features.
During this 12-week internship, I actively participated in various stages of the ZoomBot AI project, including understanding the system architecture, contributing to feature development, addressing technical challenges, and exploring future enhancements for this cutting-edge solution.
Manav Shah


About Webelight Solutions
Company Profile
Established in 2014, Webelight Solutions has grown into a global technology company employing between 51 and 200 professionals.
Core Services
Custom software development and comprehensive digital marketing solutions including Social Media Marketing, SEO, Email Marketing, and Online Advertising.
Technical Expertise
Specializes in SaaS Development, Web and Mobile Application Development, ReactJS, NodeJS, DevOps, and AI/ML Solutions.
Webelight's mission is to help organizations create and implement digital solutions by drawing on its technical expertise. The company fosters a culture of innovation, collaboration, and client focus, and promotes continuous learning and adaptation in rapidly evolving technological fields.
Internship Objectives

Personal Growth
Develop professional skills and industry exposure

Technical Skill Development
Learn relevant technologies and implementation

Project Contribution
Actively participate in ZoomBot AI development
The primary objectives included understanding ZoomBot AI's system architecture, contributing to feature development, learning relevant technologies (Zoom SDKs, WebSocket, cloud services), addressing technical challenges, and contributing to system optimization.
Personal and professional growth objectives focused on enhancing technical skills, developing problem-solving abilities, improving teamwork and collaboration, gaining industry exposure, and enhancing communication skills.
Technical Skills Acquired
Zoom SDK Integration
Experience with Zoom Meeting SDK (Linux and Web) and Zoom App Marketplace integration for bot participation and user interface.
Real-time Audio Processing
UDP for audio streaming, root-mean-square (RMS) analysis for silence detection, and audio data handling for transcription and analysis.
Cloud-Based AI Services
Integration with Google Cloud Speech-to-Text and OpenAI GPT-4o-mini for transcription and content generation.
Database & API Integration
PostgreSQL for data storage and WebSocket communication for real-time updates.
Professional Development
Teamwork
Collaborating effectively with the Webelight team to achieve project goals

Communication
Preparing reports, participating in meetings, and presenting technical concepts

Problem-Solving
Identifying and resolving technical challenges systematically

Time Management
Meeting deadlines and organizing work schedule effectively

Beyond technical skills, this internship significantly contributed to my professional development. I enhanced my adaptability and learning agility in a fast-paced environment, developed a strong work ethic, and gained valuable experience in workplace etiquette and responsibility.
ZoomBot AI: Project Overview
Real-time Audio Recording and Transmission
Captures both merged and individual audio streams using the Zoom Meeting SDK for Linux and transmits them in real time via UDP for processing.
Silence Detection and Icebreaker Questions
Utilizes RMS analysis to detect silence and generates contextual icebreaker questions using OpenAI GPT-4o-mini.
Live Transcription and Summary Generation
Provides real-time transcription using Google Cloud Speech-to-Text and generates meeting summaries in English and Arabic.
Meeting Event Handling
Tracks real-time meeting events using Zoom's WebSocket Event API to update system status and user interface.
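The RMS-based silence check mentioned above can be sketched as follows. This is a minimal illustration rather than the project's actual code: it assumes 16-bit signed little-endian mono PCM frames, and the function names `frame_rms` and `is_silent` as well as the threshold of 500 are hypothetical choices made for the example.

```python
import math
import struct


def frame_rms(pcm_bytes: bytes) -> float:
    """Return the root-mean-square amplitude of a 16-bit little-endian PCM frame."""
    sample_count = len(pcm_bytes) // 2
    if sample_count == 0:
        return 0.0
    # Ignore a trailing odd byte, if any, so the unpack format matches the data.
    samples = struct.unpack(f"<{sample_count}h", pcm_bytes[: sample_count * 2])
    return math.sqrt(sum(s * s for s in samples) / sample_count)


def is_silent(pcm_bytes: bytes, threshold: float = 500.0) -> bool:
    """Treat a frame as silent when its RMS is below the (assumed) threshold."""
    return frame_rms(pcm_bytes) < threshold
```

In practice the threshold would need tuning against real meeting audio, since a fixed value does not account for rooms with different background noise levels.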
System Architecture and Data Flow

Zoom Meeting Participants
Users interact with ZoomBot through the Zoom interface

Main Server Processing
Handles audio, transcription, and AI features

Cloud Services Integration
Google Speech-to-Text and OpenAI GPT-4o-mini

Data Storage and Retrieval
PostgreSQL for meeting data persistence
The architecture is designed to be modular and scalable, integrating various components to ensure smooth and efficient functionality. Data flows from Zoom meetings through the Main Server for processing, utilizing cloud services for AI features, and storing results in PostgreSQL for future retrieval.
Technical Challenges and Solutions
Real-time Audio Transmission
Challenge: Packet loss and latency issues inherent in UDP transmission affecting audio quality.
Solution: Implemented redundant packet transmission, buffering, and jitter compensation, and tuned UDP socket settings for low-latency transmission.
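One way to picture the buffering and jitter compensation is a small reordering buffer on the receiving side. The sketch below is an illustration under stated assumptions, not the project's implementation: it assumes each UDP datagram starts with a 4-byte big-endian sequence number, and the names `pack_packet`, `JitterBuffer`, and `max_wait` are hypothetical. Sequence-number wraparound is not handled.

```python
import struct
from typing import Dict, List

HEADER = struct.Struct("!I")  # assumed header: 4-byte big-endian sequence number


def pack_packet(seq: int, payload: bytes) -> bytes:
    """Prefix an audio payload with its sequence number."""
    return HEADER.pack(seq) + payload


class JitterBuffer:
    """Reorders packets, drops duplicates, and skips gaps after a bounded wait."""

    def __init__(self, max_wait: int = 3) -> None:
        self.pending: Dict[int, bytes] = {}
        self.next_seq = 0
        # Number of buffered packets tolerated before a missing one is given up on.
        self.max_wait = max_wait

    def push(self, packet: bytes) -> List[bytes]:
        """Accept one datagram and return the payloads now ready, in order."""
        (seq,) = HEADER.unpack_from(packet)
        payload = packet[HEADER.size:]
        # Redundant copies and packets older than the playout point are ignored.
        if seq < self.next_seq or seq in self.pending:
            return []
        self.pending[seq] = payload
        return self._drain()

    def _drain(self) -> List[bytes]:
        ready: List[bytes] = []
        while True:
            if self.next_seq in self.pending:
                ready.append(self.pending.pop(self.next_seq))
                self.next_seq += 1
            elif len(self.pending) >= self.max_wait:
                # Treat the missing packet as lost and jump to the oldest buffered one.
                self.next_seq = min(self.pending)
            else:
                return ready
```

A receiver could feed each datagram returned by `socket.recvfrom` into `push` and forward the returned payloads to the transcription stage. A larger `max_wait` tolerates more reordering at the cost of added latency.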
Silence Detection Accuracy
Challenge: Background noise affecting RMS values, leading to false silence detection.
Solution: Applied noise filtering, used dynamic silence thresholds, and introduced a rolling-average window to avoid sudden false detections.
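The combination of a rolling average and a dynamic threshold might look like the sketch below. It is an assumption-laden illustration, not the project's code: the class name `SilenceDetector`, the parameters `window`, `margin`, and `adapt`, and the rule of adapting the noise floor only during silence are all hypothetical design choices.

```python
from collections import deque


class SilenceDetector:
    """Classifies frames as silent using smoothed RMS and an adaptive threshold."""

    def __init__(self, initial_noise_floor: float, window: int = 5,
                 margin: float = 2.0, adapt: float = 0.1) -> None:
        self.recent = deque(maxlen=window)  # last `window` RMS values
        self.noise_floor = initial_noise_floor
        self.margin = margin  # threshold = noise_floor * margin
        self.adapt = adapt    # how quickly the noise floor follows quiet frames

    def update(self, rms: float) -> bool:
        """Add one frame's RMS value and return True if the meeting looks silent."""
        self.recent.append(rms)
        # The rolling average keeps a single quiet frame from flipping the result.
        smoothed = sum(self.recent) / len(self.recent)
        silent = smoothed < self.noise_floor * self.margin
        if silent:
            # Adapt only during silence so that speech does not raise the threshold.
            self.noise_floor += self.adapt * (smoothed - self.noise_floor)
        return silent
```

With a window of three frames, one quiet frame in the middle of speech is not reported as silence; the detector only reports silence once the quiet frames dominate the window.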
WebSocket Reliability
Challenge: Frequent WebSocket disconnections causing loss of real-time updates.
Solution: Implemented auto-reconnect logic with exponential backoff, used heartbeat messages, and set up a message queue to ensure missed events are re-sent.
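The reconnect-with-exponential-backoff idea can be sketched independently of any particular WebSocket library. The code below is a minimal sketch under assumptions: `backoff_delays` and `connect_with_retry` are hypothetical names, the `connect` callable stands in for whatever opens the real connection, and the base delay, growth factor, and cap are illustrative values. Heartbeats and the message queue are not covered here.

```python
import random
import time
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


def backoff_delays(base: float = 1.0, factor: float = 2.0,
                   cap: float = 30.0) -> Iterator[float]:
    """Yield base, base*factor, base*factor**2, ... with each value capped at `cap`."""
    delay = base
    while True:
        yield min(delay, cap)
        delay *= factor


def connect_with_retry(connect: Callable[[], T], max_attempts: int = 5,
                       sleep: Callable[[float], object] = time.sleep,
                       jitter: float = 0.0) -> T:
    """Call `connect` until it succeeds, waiting longer after each failure."""
    delays = backoff_delays()
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up and surface the last error to the caller
            # Optional random jitter spreads out reconnects from many clients.
            sleep(next(delays) + random.uniform(0, jitter))
    raise RuntimeError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the retry logic testable without real waiting; in production the default `time.sleep` (or an async equivalent) would be used.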
Future Scope and Enhancements
Sentiment Analysis
Detect emotional tone of participants to understand meeting mood

Action Item Detection
Automatically identify and track tasks from meeting discussions

Speaker Identification
Enhanced transcription with speaker diarization

Multi-language Expansion
Support for additional languages beyond English and Arabic

Additional enhancements could include integration with other collaboration platforms, customizable icebreaker questions, real-time feedback for speakers, and enhanced summary customization. Performance improvements would focus on optimizing the audio processing pipeline, enhancing transcription speed, and scaling infrastructure for higher loads.
Conclusion and Key Takeaways
12
Weeks
Of intensive hands-on experience in AI and software development
4+
Key Features
Contributed to silence detection, transcription, and WebSocket communication
10+
Technologies
Gained experience with Zoom SDKs, cloud AI services, and real-time systems
This internship at Webelight has been an immensely valuable and transformative experience. The opportunity to contribute to the ZoomBot AI project provided practical experience that complements academic learning and has solidified my interest in AI and software development.
I am deeply grateful to Webelight for their support and guidance throughout this journey. This experience has equipped me with the confidence and skills to pursue a successful career in the technology industry.