A simple Android-based cloud assistant, powered by Google Gemini, that talks to the model through a Flask server.
The user interacts by voice. The app converts the spoken query to text and sends it to the Flask server. The server prompts the LLM with the query and sends the response back to the app, which reads it aloud to the user.
[GET request for text queries is working. POST request for image queries is in progress.]
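The server side of the text-query path might look roughly like the sketch below. It is not the project's actual code: the `/query` route, the `q` parameter, the JSON field names, and the `ask_llm` placeholder (which stands in for the real Gemini call and simply echoes the prompt) are all assumptions made for illustration.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def ask_llm(prompt: str) -> str:
    """Hypothetical placeholder for the Gemini call; it only echoes the prompt."""
    return f"(model reply to: {prompt})"


@app.route("/query", methods=["GET"])  # route name is an assumption
def query():
    # The app sends the transcribed speech as a query parameter ("q" is assumed).
    text = request.args.get("q", "").strip()
    if not text:
        # Reject empty queries instead of prompting the model with nothing.
        return jsonify(error="missing query"), 400
    # Forward the text to the model and return the reply for the app to speak.
    return jsonify(response=ask_llm(text))
```

With a server like this running, the app would issue a request such as `GET /query?q=what+is+the+weather` and pass the `response` field of the JSON reply to its text-to-speech engine.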
[On the Flask server, the user can also send a URL to receive a summary of the text on that webpage.]
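One way the URL-summary step could work is to fetch the page, strip it down to its visible text, and wrap that text in a summarization prompt for the LLM. The sketch below uses only the Python standard library; the names `TextExtractor`, `html_to_text`, and `summary_prompt`, the prompt wording, and the `max_chars` limit are hypothetical and not taken from the project.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html: str) -> str:
    """Return the page's visible text as a single space-separated string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def summary_prompt(url: str, max_chars: int = 8000) -> str:
    """Fetch the page and build a prompt asking the model for a summary."""
    with urlopen(url, timeout=10) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    # Truncate long pages so the prompt stays within an assumed size budget.
    return "Summarize the following web page text:\n\n" + html_to_text(html)[:max_chars]
```

The string returned by `summary_prompt` would then be sent to the model in the same way as a spoken query.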